Posts tagged GitHub

Tag: GitHub

Feed: R-bloggers. Author: Thinking inside the box. [This article was first published on Thinking inside the box, and kindly contributed to R-bloggers]. A new package of mine arrived on CRAN yesterday, having been uploaded a few days prior on the weekend. It extends the most excellent (and very minimal / zero depends) unit testing package tinytest by Mark van der Loo with the very clever and well-done diffobj package ... Read More

Feed: R-bloggers. Author: AbdulMajedRaja RS. REGEX is that thing that scares everyone almost all the time. Hence, finding some alternative is always very helpful and peaceful too. Here’s a nice R package that helps us do REGEX without knowing REGEX. REGEX: This is the REGEX pattern to test the validity of a URL: ^(http)(s)?(://)(www.)?([^ ]*)$ A typical regular expression contains characters (http) and meta-characters ([]). The combination of these two forms a meaningful regular expression for a particular task. So, what’s the problem? Remembering the way in which characters and meta-characters are combined to create a meaningful regex is ... Read More
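To make the quoted pattern concrete, here is a minimal sketch that applies it with Python's standard `re` module. The function name `looks_like_url` and the sample strings are assumptions for illustration and do not come from the original post, which is about an R package. Note that the `.` in `www.` is unescaped in the quoted pattern, so it matches any character, not only a literal dot.

```python
import re

# Pattern copied verbatim from the excerpt above.
# Groups: "http", optional "s", "://", optional "www" + any one character,
# then any run of non-space characters up to the end of the string.
URL_PATTERN = re.compile(r"^(http)(s)?(://)(www.)?([^ ]*)$")


def looks_like_url(text: str) -> bool:
    """Return True if `text` matches the quoted pattern (hypothetical helper)."""
    return URL_PATTERN.match(text) is not None


if __name__ == "__main__":
    for candidate in [
        "https://www.r-bloggers.com",
        "http://example.com",
        "ftp://example.com",
        "https://exa mple.com",
    ]:
        print(candidate, looks_like_url(candidate))
```

The pattern is permissive: it only checks the scheme and the absence of spaces, so a bare `http://` also passes.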

Feed: R-bloggers. Author: R on The broken bridge between biologists and statisticians. Let’s imagine a field experiment, where different genotypes of khorasan wheat are to be compared under different nitrogen (N) fertilisation systems. Genotypes require bigger plots than fertilisation treatments and, therefore, the most convenient choice would be to lay out the experiment as a split-plot, in a randomised complete block design. Genotypes would be randomly allocated to main plots, while fertilisation systems would be randomly allocated to sub-plots. As usual in agricultural research, the experiment should be repeated in different years, in order to explore the environmental variability ... Read More
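The randomisation described above can be sketched in a few lines of Python. This is only an illustration of the split-plot idea, not the author's code: the function name, the genotype labels and the fertilisation labels are all hypothetical, and the original post works in R.

```python
import random


def split_plot_layout(genotypes, treatments, n_blocks, seed=None):
    """Randomise a split-plot layout within complete blocks (hypothetical helper).

    Within each block, genotypes are shuffled over main plots; within each
    main plot, fertilisation treatments are shuffled over sub-plots.
    Returns tuples (block, main_plot, genotype, sub_plot, treatment).
    """
    rng = random.Random(seed)
    layout = []
    for block in range(1, n_blocks + 1):
        # Independent randomisation of genotypes to main plots per block.
        main_order = rng.sample(genotypes, k=len(genotypes))
        for main_plot, genotype in enumerate(main_order, start=1):
            # Independent randomisation of treatments to sub-plots per main plot.
            sub_order = rng.sample(treatments, k=len(treatments))
            for sub_plot, treatment in enumerate(sub_order, start=1):
                layout.append((block, main_plot, genotype, sub_plot, treatment))
    return layout


if __name__ == "__main__":
    # Placeholder labels, not taken from the original experiment.
    for row in split_plot_layout(["G1", "G2", "G3"], ["N0", "N1"], n_blocks=2, seed=42):
        print(row)
```

Every block contains each genotype exactly once, and every main plot contains each fertilisation treatment exactly once, which is what makes the blocks complete.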

Feed: R-bloggers. Author: R Views. [This article was first published on R Views, and kindly contributed to R-bloggers]. R/Medicine 2019 kicked off on Thursday with two outstanding workshops. It was difficult to choose between the two, but fortunately both presenters developed rich sets of materials that are available online. Alison Hill delivered R Markdown for Medicine with an elegant HTML exposition masterfully created to cultivate beginners while still engaging experienced ... Read More

Feed: Planet PostgreSQL. Postgres is the leading feature-full independent open-source relational database, steadily increasing its popularity for the past 5 years. TimescaleDB is a clever extension to Postgres which implements time-series related features, including automatic partitioning under the hood, and more. Because he knows how I like to investigate Postgres (among other things) performance, Simon Riggs (2ndQuadrant) prompted me to look at the performance of loading a lot of data into Postgres and TimescaleDB, so as to understand the degraded performance reported in their TimescaleDB vs Postgres comparison. Simon provided support, including provisioning 2 AWS VMs for a few days each. Summary: The ... Read More

Feed: Featured Blog Posts - Data Science Central. Author: Divya Singh. Who should read this blog: Someone who is new to linear regression. Someone who wants to understand the jargon around Linear Regression. Code Repository: https://github.com/DhruvilKarani/Linear-Regression-Experiments Linear regression is generally the first step in anyone’s Data Science journey. When you hear the words Linear and Regression, something like this pops up in your mind: X1, X2, ..., Xn are the independent variables or features. W1, W2, ..., Wn are the weights (learned by the model from the data). Y’ is the model prediction. For a set of say 1000 points, we have a table with 1000 rows, n ... Read More
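The relationship between the features X1..Xn, the weights W1..Wn and the prediction Y’ can be sketched as a weighted sum. The excerpt's own formula did not survive, so this is an assumption about its form: the standard linear model, with an optional intercept that the excerpt does not mention. The function names and the numbers are hypothetical.

```python
def predict(features, weights, intercept=0.0):
    """Linear prediction Y' = intercept + W1*X1 + ... + Wn*Xn (assumed form)."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    return intercept + sum(w * x for w, x in zip(weights, features))


def predict_all(rows, weights, intercept=0.0):
    """Apply `predict` to every row of a table (one row per data point)."""
    return [predict(row, weights, intercept) for row in rows]


if __name__ == "__main__":
    # Two made-up data points with n = 2 features each.
    table = [[1.0, 2.0], [3.0, 4.0]]
    print(predict_all(table, weights=[0.5, 0.25]))
```

With 1000 points and n features, `rows` would be a 1000-by-n table and the result a list of 1000 predictions.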

Feed: Featured Blog Posts - Data Science Central. Author: Emmanuelle Rieuf. This article was written by Jean-Nicholas Hould. For the last six years, Jean-Nicholas has been working professionally in the field of data science. During those years, he has been doing lots of data engineering, analysis and statistics. I recently came across a paper named Tidy Data by Hadley Wickham. Published back in 2014, the paper focuses on one aspect of cleaning up data, tidying data: structuring datasets to facilitate analysis. Through the paper, Wickham demonstrates how any dataset can be structured in a standardized way prior to analysis. He presents in detail the ... Read More
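One common tidying step in the spirit of the paper is reshaping a "wide" table, where column headers are really values, into a "long" one with one observation per row. The sketch below does this in plain Python. It is an illustration only: the `melt` helper, the column names and the sample records are hypothetical and are not taken from the article or the paper.

```python
def melt(rows, id_key, var_name, value_name):
    """Reshape wide records into long records (hypothetical helper).

    Every key other than `id_key` becomes its own output row, with the key
    stored under `var_name` and its value stored under `value_name`.
    """
    tidy = []
    for row in rows:
        for key, value in row.items():
            if key == id_key:
                continue
            tidy.append({id_key: row[id_key], var_name: key, value_name: value})
    return tidy


if __name__ == "__main__":
    # Made-up wide data: one column per treatment.
    wide = [
        {"person": "A", "treatment_a": 1, "treatment_b": 2},
        {"person": "B", "treatment_a": 3, "treatment_b": 4},
    ]
    for record in melt(wide, id_key="person", var_name="treatment", value_name="result"):
        print(record)
```

After the reshape, each variable (person, treatment, result) has its own column and each row is a single observation.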

Feed: R-bloggers. Author: Robin Ryder. This is an attempt at reproducing the analysis of Section 2.7 of Bayesian Data Analysis, 3rd edition (Gelman et al.), on kidney cancer rates in the USA in the 1980s. I have done my best to clean the data from the original. Andrew wrote a blog post to “disillusion [us] about the reproducibility of textbook analysis”, in which he refers to this example. This might then be an attempt at reillusionment… The cleaner data are on GitHub, as is the RMarkDown of this analysis. library(usmap) library(ggplot2) d = read.csv("KidneyCancerClean.csv", skip=4) In the data, the columns ... Read More

Feed: Blog Post – Corporate – DataStax. Author: Sebastian Estevez. Yesterday at ApacheCon, our very own Patrick McFadin announced the public preview of an open source tool that enables developers to run their AWS DynamoDB™ workloads on Apache Cassandra. With the DataStax Proxy for DynamoDB and Cassandra, developers can run DynamoDB workloads outside of AWS (including on premises, in other clouds, and in hybrid configurations). The Big Picture: The cloud has changed computing forever, and as cloud services continue to evolve up the stack, the new capabilities they offer developers come with trade-offs. One of the trade-offs developers, architects, and ... Read More

Feed: Planet big data. Author: David Smith. I've been at the EARL Conference in London this week, and as always it's been inspiring to see so many examples of R being used in production at companies like Sainsbury's, BMW, Austria Post, PartnerRe, Royal Free Hospital, the BBC, the Financial Times, and many others. My own talk, A DevOps Process for Deploying R to Production, presented one approach to automating the building and deployment of R-based applications using Azure Pipelines and Azure Machine Learning Service. The talk at EARL wasn't recorded, but you can see the slides here, and also ... Read More