该读

Join the TensorFlow team as they kick off the 2018 TensorFlow Dev Summit! The TensorFlow Dev Summit brings together a diverse mix of machine learning users from around the world for a full day of highly technical talks, demos, and conversations with the TensorFlow team and community.

I’ve used Python’s textblob classifier to simply classify issues according to assignees from their description and headers. Classified issues used to classify newly created issues and results are recorded to a database. 2019 issues used as training set and %82 assignment accuracy have been achieved. As the training set grows bigger accuracy could be better.

I frequently predict proportions (e.g., proportion of year during which a customer is active). This is a regression task because the dependent variables is a float, but the dependent variable is bound between the 0 and 1. Googling around, I had a hard time finding the a good way to model this situation, so I’ve written here what I think is the most straight forward solution.

Clustering data is the process of grouping items so that items in a group (cluster) are similar and items in different groups are dissimilar. After data has been clustered, the results can be analyzed to see if any useful patterns emerge. For example, clustered sales data could reveal which items are often purchased together (famously, beer and diapers).

In this article, we will go through the evaluation of Topic Modelling by introducing the concept of Topic coherence, as topic models give no guaranty on the interpretability of their output. Topic modeling provides us with methods to organize, understand and summarize large collections of textual information. There are many techniques that are used to obtain topic models. Latent Dirichlet Allocation (LDA) is a widely used topic modeling technique to extract topic from the textual data.

PyTorch 1.0 takes the modular, production-oriented capabilities from Caffe2 and ONNX and combines them with PyTorch's existing flexible, research-focused design to provide a fast, seamless path from research prototyping to production deployment for a broad range of AI projects.

Recently I did some PostgreSQL consulting in the Berlin area (Germany) when I stumbled over an interesting request: How can data be shared across function calls in PostgreSQL? I recalled some one of the other features of PostgreSQL (15+ years old or so) to solve the issue. Here is how it works.

So you have a dataset and you’re about to run some test on it but first, you need to check for normality. Think about this question, “Given my data … if there is a deviation from normality, will there be a material impact my results?”

Given 4 assets’ risk and return as following, what could be the risk-return for any portfolio built with the assets. One may think that all possible values should fall inside the area. But it is possible to go beyond the bond, because combining inversely correlated assets can construct a portfolio with lower risk.

(￣▽￣)

Search the most similar strings against the query in Python 3. State-of-the-art algorithm and data structure are adopted for best efficiency. For both flexibility and efficiency, only set-based similarities are supported right now, including Jaccard and Tversky.