Sumendar Karupakala

Data Science Analyst

About me

I have in-depth experience in quantitative research analysis: generating testable hypotheses, formatting and managing large data sets, and building, training, and evaluating machine learning models. I work with open source tools such as R and Python, databases such as MySQL, and visualization tools such as Tableau.

Selected Publications

There are many great online resources that can help you master data science. Here is a comprehensive list of data science course providers, along with links to their data science courses.

There is a gap between the demand for and the supply of big-data-savvy professionals and technologists. Data experts will be a scarce and valuable commodity, and the education system has been doing a tremendous job of tackling this challenge.

Overweight and obesity are increasingly prevalent in the general pediatric population. Evidence suggests that children with autism spectrum disorders (ASDs) may be at elevated risk for unhealthy weight. We identify the prevalence of overweight and obesity in a multisite clinical sample of children with ASDs and explore concurrent associations with variables identified as risk factors for unhealthy weight in the general population.

This is an R Markdown Notebook. When you execute code within the notebook, the results appear beneath the code.
Try executing this chunk by clicking the Run button within the chunk or by placing your cursor inside it and pressing Ctrl+Shift+Enter.
plot(cars)
Add a new chunk by clicking the Insert Chunk button on the toolbar or by pressing Ctrl+Alt+I.
When you save the notebook, an HTML file containing the code and output will be saved alongside it (click the Preview button or press Ctrl+Shift+K to preview the HTML file).

dplyr - Data Manipulation Package: Introduction. Most data scientists spend 70 to 80% of their time on a given project on data preparation, also known as wrangling, cleaning, or simply data manipulation. dplyr is one of the most popular packages that helps R users prepare or manipulate a dataset before the actual analysis or modeling. Typical operations include selecting required columns, adding a new column, filtering required observations, sorting, and aggregating.
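As a rough illustration of the operations listed above, here is a minimal sketch using dplyr's verbs on the mtcars dataset that ships with R. The choice of dataset, the mpg > 20 threshold, and the wt_kg column are assumptions made for this example, not part of the original post.

```r
library(dplyr)

# mtcars is a built-in R dataset, used here only for illustration.
result <- mtcars %>%
  select(mpg, cyl, wt) %>%                # select the required columns
  mutate(wt_kg = wt * 1000 * 0.4536) %>%  # add a new column (wt is in 1000 lbs)
  filter(mpg > 20) %>%                    # keep only the required observations
  arrange(desc(mpg))                      # sort by mpg, highest first

# Aggregate: number of cars and mean mpg per cylinder count.
summary_by_cyl <- result %>%
  group_by(cyl) %>%
  summarise(n = n(), mean_mpg = mean(mpg))

print(result)
print(summary_by_cyl)
```

Each verb takes a data frame and returns a data frame, which is why the steps chain cleanly with the pipe operator.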

Projects

Improving conversational use of spoken language is an important goal for many new interventions and treatments for children with neurodevelopmental disorders. However, progress in testing these treatments is limited by the lack of informative outcome measures to indicate whether or not an intervention or treatment is having the desired effect on a child’s conversational use of language (i.e., discourse skills). The goal of this project is to evaluate whether Natural Language Processing methods can be translated into meaningful outcome measures for individuals with a range of neurodevelopmental disorders. This project was recently funded by the National Institute on Deafness and Other Communication Disorders.

The goal of this project is to develop and validate a novel objective measurement tool, the Multi-modal Autism Phenotype Snapshot (MAPS), for use in clinical trials targeting core symptoms of autism. This project was funded by a Catalyst Award from the Oregon Clinical & Translational Research Institute.

The objective of this project is to further the understanding of sex differences in the fundamental patterns of behavioral and social functioning relevant to the clinical presentation of ASD. Guided by our previous research, we applied Natural Language Processing-based methods to transcripts of natural language samples in order to quantify features of atypical language use in females with ASD.