Ideas tagged with machine learning
Reports from NGOs and anecdotal evidence suggest that child abuse materials (CAMs) share several common characteristics. They are mostly set indoors, the victim's face or genitalia are visible, and there are few visual clues about the abuser(s). For known CAMs, there are methods...
Could one build a service that checks the completeness and quality of the documentation of an open source repository? A group could build up a set of repositories with documentation ratings, which could then be used to train an ML/DL model, which could in turn be used to provide the servi...
Recent work in density estimation uses a bijection $f : X \to Z$ (e.g. an invertible flow or autoregressive model) and a tractable density $p(z)$ (e.g. [[1]](https://arxiv.org/abs/1410.8516) [[2]](http://www.dmi.usherb.ca/~larocheh/projects_nade.html) [[3]](https://arxiv.org/abs/1410.6460) [[4]...
By Kyle Cranmer, Gilles Louppe
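For the density estimation entry above, the role of the bijection $f$ can be made concrete with the standard change-of-variables identity. This is the textbook formula, not necessarily the exact formulation the authors use; the symbol $p_X$ for the density on $X$ is introduced here for illustration only.

$$
p_X(x) = p\big(f(x)\big) \left| \det \frac{\partial f(x)}{\partial x} \right|
$$

Evaluating the density therefore needs only $f$ and the determinant of its Jacobian, which is why constructions with cheap Jacobian determinants are attractive in this setting.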
Traditional phylogenetic trees are represented as bifurcating trees, where the leaf nodes represent taxa and the internal nodes represent common ancestors. Bifurcating trees offer advantages of interpreting common ancestors as well as being widely accepted; however, this representation could lim...
By Cole Lyman
Concept drift, a phenomenon where the statistical properties of the target variable change over time, poses a significant challenge in data stream mining. The scarcity of real-world datasets with concept drift makes this challenge harder for many researchers. This brief proposes an approach to ge...
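To make the definition of concept drift above concrete, here is a minimal sketch of a synthetic stream whose labelling rule flips abruptly at a chosen time step. It is only an illustration of the phenomenon, not the approach the brief proposes; `drifting_stream`, `accuracy_of_old_rule`, the 0.5 threshold, and the drift point are all hypothetical choices.

```python
import random


def drifting_stream(n_samples, drift_at, seed=0):
    """Yield (t, x, y); the rule mapping x to y flips at t == drift_at."""
    rng = random.Random(seed)
    for t in range(n_samples):
        x = rng.random()
        above = x > 0.5
        # Before the drift: y = 1 when x > 0.5. After the drift: the rule is inverted.
        y = int(above) if t < drift_at else int(not above)
        yield t, x, y


def accuracy_of_old_rule(samples):
    """Fraction of samples on which the pre-drift rule still predicts y."""
    hits = sum(1 for _, x, y in samples if int(x > 0.5) == y)
    return hits / len(samples)


stream = list(drifting_stream(n_samples=1000, drift_at=500))
acc_before = accuracy_of_old_rule(stream[:500])
acc_after = accuracy_of_old_rule(stream[500:])
print(acc_before, acc_after)
```

A model fitted on the first half of the stream would be perfect there and wrong on every sample afterwards, which is the kind of behaviour a drift benchmark needs to expose.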
In recent times, genetic disorders have risen to be one of the significant causes of mortality. As we improve our understanding of the human genome, we see that nearly all diseases have a genetic component linked to them. Early diagnosis of these genetic diseases is crucial for successful treatm...
By Ashwin K. Jainarayanan, Nithishwer Mourouganand
Machine learning protocols use reward functions during training as a means of tuning parameters toward obtaining desirable outputs from a model. One challenge for the current AI industry is the difficulty of translating real-world utility into reward functions for individual models that are ...
In multidimensional data modeling, dimension reduction is not intuitive. Forward feature selection is often deceptive: a strongly related feature may have a small correlation coefficient (near zero) with the objective, especially when the target model is nonlinear. Therefore, we suggest...
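The claim above, that a strongly related feature can show near-zero correlation with the objective, can be checked with a small sketch. The setup is an assumption chosen for illustration (a symmetric grid of inputs and the target `y = x**2`), and `pearson` is a hand-written helper, not part of the authors' method.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Inputs symmetric around zero; the target depends on x alone, but nonlinearly.
xs = [i / 100 for i in range(-100, 101)]
ys = [x * x for x in xs]

r_raw = pearson(xs, ys)                          # linear correlation of x with y
r_transformed = pearson([x * x for x in xs], ys)  # correlation after squaring x

print(f"corr(x, y)   = {r_raw:.6f}")
print(f"corr(x^2, y) = {r_transformed:.6f}")
```

Here `x` determines `y` completely, yet a selection step that ranks features by linear correlation would discard it, since the positive and negative halves of the grid cancel.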
The Rayleigh pitot tube formula is a basic and commonly used formula in aerodynamics. It gives the ratio of the total pressure behind a normal shock wave to the freestream static pressure at a given Mach number. Although the relation is rigorous in theory, it is difficult to understand the changing...
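For reference, the formula discussed above can be evaluated with a short sketch. It uses the standard textbook form of the Rayleigh pitot tube relation for a calorically perfect gas; the function name `rayleigh_pitot_ratio` and the default `gamma=1.4` (air) are assumptions made here, not taken from the entry.

```python
def rayleigh_pitot_ratio(mach, gamma=1.4):
    """Total pressure behind a normal shock divided by freestream static pressure.

    Valid for supersonic (or sonic) freestream Mach numbers only.
    """
    if mach < 1.0:
        raise ValueError("the Rayleigh pitot relation applies for Mach >= 1")
    m2 = mach * mach
    # First factor, raised to gamma / (gamma - 1).
    base = ((gamma + 1.0) ** 2 * m2) / (4.0 * gamma * m2 - 2.0 * (gamma - 1.0))
    # Second factor: static pressure jump across the normal shock.
    jump = (1.0 - gamma + 2.0 * gamma * m2) / (gamma + 1.0)
    return base ** (gamma / (gamma - 1.0)) * jump


for mach in (1.0, 1.5, 2.0, 3.0):
    print(f"M = {mach:.1f}  p02/p1 = {rayleigh_pitot_ratio(mach):.4f}")
```

At Mach 1 the shock vanishes and the ratio reduces to the isentropic total-to-static pressure ratio, which gives a quick sanity check on the implementation.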
Detecting potential problems in code before a product is released can prevent problems in production and lower the cost of operating the system. Automated code review tools rely on detecting code patterns that are known to cause problems. These methods are unable to find n...