Columnstore Indexes And ML Services

I expect not just a couple of rows to be sent over for the Machine Learning Services, but huge tables with million of rows, that also contain hundreds of columns, because this kind of tables are the basis for the Data Science and Machine Learning processes.While of course we are focusing here on rather small part of the total process (just the IO between SQL Server relational Engine and the Machine Learning Services), where the analytical process itself can take hours, but the IO can still make a good difference in some cases.I love this improvement, which is very under-the-hood, but it will help a couple of people to make a decision of migrating to the freshly released SQL Server 2017 instead of the SQL Server 2016.

I haven’t quite taken advantage of this yet (just moved to 2017 but still in 130 compatibility mode) but fingers crossed that I’ll see those improvements.

Related Posts

Joe Obbish has another post looking at sub-optimal columnstore index performance: It is possible to see a scalability bottleneck in the form of high average wait time for the RESERVED_MEMORY_ALLOCATION_EXT wait if a highly concurrent workload is run on a server that consumes memory with batch mode operators. I believe that the severity of the bottleneck depends […]

Fisseha Berhane looked at Extreme Gradient Boosting with R and now covers it in Python: In both R and Python, the default base learners are trees (gbtree) but we can also specify gblinear for linear models and dart for both classification and regression problems. In this post, I will optimize only three of the parameters […]