This post is a bit different from the others in the series about data mining and machine learning algorithms. This time I am honored and humbled to announce that my fourth Pluralsight course is live. This is the Data Mining Algorithms in SSAS, Excel, and R course. Read More...

With the K-Means algorithm, each object is assigned to exactly one cluster. It is assigned to this cluster with a probability equal to 1.0. It is assigned to all other clusters with a probability equal to 0.0. This is hard clustering. Instead of distance, Read More...

Hierarchical clustering can be very useful because it is easy to see the optimal number of clusters in a dendrogram, and because the dendrogram visualizes both the clusters and the process of building them. However, hierarchical methods don’t Read More...

Clustering is the process of grouping the data into classes or clusters so that objects within a cluster have high similarity in comparison to one another, but are very dissimilar to objects in other clusters. Dissimilarities are assessed based on the Read More...

Data mining is the most advanced part of business intelligence. With statistical and other mathematical algorithms, you can automatically discover patterns and rules in your data that are hard to notice with on-line analytical processing and reporting. Read More...

We are close to the publication day of the T-SQL Querying book. Of course, as always in this series, the main author of the book is Itzik Ben-Gan. This time, besides me, Adam Machanic and Kevin Farlee are the coauthors. The information I want to share Read More...

So the event is over. I think I can say for all three organizers, Mladen Prajdić, Matija Lah, and me, that we are tired now. However, we are extremely satisfied. It was a great event. First, a few numbers and a comparison with SQL Saturday #274, the first Read More...

I am proud and glad to announce two top pre-conference seminars at the PASS SQL Saturday #356 Slovenia conference. The speakers and the seminar titles are: Stacia Misner - Power Up Your Data with Excel and Power BI; Kevin Boles - Tune Like A Guru! Read More...

Back home from SQLBits XII, back to normal life. However, I am already missing the conference. I have spoken with quite a few other speakers, and our conclusion is, as my friend Niko Neugebauer said: this conference is simply second to none. Great content, Read More...

Pure success! I could simply stop here. However, I want to mention again everybody involved in this, and also some who were unfortunately missing. First of all, PASS is the organization that defined SQL Saturdays. And apparently the idea works I have Read More...

SQL Saturday #274 Slovenia has been full (150 registered attendees) for more than a week. There are still some people on the waiting list. We sent an e-mail to registered attendees asking them to unregister if they already know they can’t make it. I have to Read More...

This is the third part of the fraud detection whitepaper. You can find the first part and the second part in my previous blog posts about this topic. Data Preparation The problem of credit card fraud detection is not trivial. With every transaction processed, Read More...

This is the second part of the fraud detection whitepaper. You can find the first part in my previous blog post about this topic. My Approach to Data Mining Projects It is impossible to evaluate the time and money needed for a complete fraud detection Read More...

While working on different fraud detection projects, I developed my own approach to the solution for this problem. In my PASS Summit 2013 session I am introducing this approach. I also wrote a whitepaper on the same topic, which was generously reviewed Read More...

Many companies and organizations do regular data cleansing. When you cleanse the data, the data quality rises to a higher level. The data quality level is determined by the amount of work invested in the cleansing. As time passes, the data quality Read More...