
Problem Statement

Stunting affects more than one in four children worldwide. Children with stunted growth have an increased risk of early death, higher burden of disease, compromised physical capacities, and diminished cognitive development. This can reduce the productivity of an entire generation.

The roots of stunted growth begin in the womb, leading to low-birth-weight infants who enter the world at a deficit. Being able to predict, early in pregnancy, whether a child will have a low birth weight can help initiate interventions that lead to healthy live births. We therefore need to search not only for the causes of low birth weight but also for methods of predicting preterm birth (preterm babies are born before they have spent the full 9 months in the womb).

Our goal is to determine a combination of early measures that would be a good predictor for birth weight. In pursuit of this goal, we have collected time series data from ultrasounds on pregnant mothers. We would like you to use this data to predict a child’s birth weight and birth date (days from pregnancy start).

You may download the learning data set from here. The data set is a CSV file; details of its format are provided below:

For each fetus, the data gives sex, status, and multiple ultrasound measurements (columns 5-12) taken during the pregnancy (the time variable is t.ultsnd). The repeated ultrasounds provide a short time series that can be used to predict the birth weight and day. More specifically, each fetus has 6 ultrasounds done at regular intervals. For almost all IDs, only one of the 8 possible measurements is noted at the first ultrasound. At each subsequent ultrasound, the remaining 7 measurements are noted almost every time (there are a few cases with missing values). An example of the measurements for a single fetus is shown below.

In the String[] trainingData, each String is one record for some fetus and has 14 comma-separated tokens, in the same order as described in the table above. The format of testingData is almost the same as that of trainingData. The only difference is that the last two columns (the weight and the birth days) are missing. Records with the same ID are consecutive. The returned double[] should contain the corresponding predictions for birthday (pregnancy duration) and weight of the fetus, for each ID, in numerical order by ID. More specifically, elements 0 and 1 represent the first fetus’s birthday and weight, elements 2 and 3 the second fetus’s birthday and weight, and so on. The length of the returned array equals twice the number of tested fetuses.

As an example, if the testing data contains several rows each for IDs 13, 4, and 9, then the return value should have six elements: {b4, w4, b9, w9, b13, w13}.
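The input and output conventions above can be illustrated with a minimal baseline sketch. It is not a recommended solution: it simply predicts the training mean for every tested fetus. The column positions are assumptions, since the column table is not reproduced here: the ID is taken to be the first token, and the weight and the birth days are taken to be the 13th and 14th tokens, in that order.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

public class ChildStuntedness {
    // Assumed 0-based column positions (not confirmed by the statement).
    static final int ID_COL = 0;
    static final int WEIGHT_COL = 12;   // assumption: second-to-last token
    static final int DURATION_COL = 13; // assumption: last token

    public double[] predict(String[] training, String[] testing) {
        // Keep one {duration, weight} pair per fetus so that a fetus with
        // more ultrasound rows does not get more influence on the mean.
        Map<Double, double[]> targets = new TreeMap<>();
        for (String row : training) {
            String[] t = row.split(",", -1);
            double id = Double.parseDouble(t[ID_COL].trim());
            targets.put(id, new double[] {
                Double.parseDouble(t[DURATION_COL].trim()),
                Double.parseDouble(t[WEIGHT_COL].trim())
            });
        }
        double meanDuration = 0.0, meanWeight = 0.0;
        for (double[] v : targets.values()) {
            meanDuration += v[0];
            meanWeight += v[1];
        }
        int n = Math.max(1, targets.size());
        meanDuration /= n;
        meanWeight /= n;

        // A TreeSet iterates in ascending numerical order, which matches
        // the required "numerical order by ID" of the return value.
        TreeSet<Double> testIds = new TreeSet<>();
        for (String row : testing) {
            testIds.add(Double.parseDouble(row.split(",", -1)[ID_COL].trim()));
        }

        // Two elements per tested fetus: birthday first, then weight.
        double[] out = new double[2 * testIds.size()];
        for (int i = 0; i < testIds.size(); i++) {
            out[2 * i] = meanDuration;
            out[2 * i + 1] = meanWeight;
        }
        return out;
    }
}
```

A real model would replace the two means with per-ID predictions computed from the ultrasound columns, while keeping the same ordering logic.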

NOTE: All data values are normalized between 0 and 1 as part of data obfuscation requirements.

Notes on Data Set Generation

The full data set contains approximately 28,000 lines, covering just over 4,800 ID values.

The full data set is divided into 20% for example tests, 30% for provisional tests, and 50% for system tests. All data belonging to the same ID is placed in the same data set.

For each test, approximately 66% of the data (from that segment) is selected for training, and the remainder for testing.

For provisional tests, all example data is also added to the training set.

For system tests, all example and provisional test data is also added to the training set.
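For local validation, a similar split can be approximated on the downloaded data. The sketch below is an assumption about how one might do this, not the generator actually used for the tests: it shuffles the distinct IDs with a seed and assigns whole IDs to either side, so that all rows of a fetus stay together. The class name GroupedSplit, the method split, and the assumption that the ID is the first token are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

public class GroupedSplit {
    /** Returns [trainRows, testRows]; rows sharing an ID are never separated. */
    public static List<List<String>> split(String[] rows, double trainFraction, long seed) {
        // Group rows by ID (assumption: the ID is the first comma-separated token).
        Map<String, List<String>> byId = new LinkedHashMap<>();
        for (String row : rows) {
            String id = row.split(",", 2)[0].trim();
            byId.computeIfAbsent(id, k -> new ArrayList<>()).add(row);
        }

        // Shuffle the IDs, not the rows, so the split is made per fetus.
        List<String> ids = new ArrayList<>(byId.keySet());
        Collections.shuffle(ids, new Random(seed));
        int cut = (int) Math.round(trainFraction * ids.size());

        List<String> train = new ArrayList<>();
        List<String> test = new ArrayList<>();
        for (int i = 0; i < ids.size(); i++) {
            (i < cut ? train : test).addAll(byId.get(ids.get(i)));
        }
        return Arrays.asList(train, test);
    }
}
```

Calling `GroupedSplit.split(rows, 0.66, 1L)` would mirror the roughly 66% training share described above. The held-out rows still carry their last two columns, so they would need to be stripped before being passed as testing data and kept aside for scoring.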

Definition

Class: ChildStuntedness

Method: predict

Parameters: String[], String[]

Returns: double[]

Method signature: double[] predict(String[] training, String[] testing)

(be sure your method is public)

Examples

0) Seed: 1

1) Seed: 2

2) Seed: 3

3) Seed: 4

4) Seed: 5

5) Seed: 6

6) Seed: 7

7) Seed: 8

8) Seed: 9

9) Seed: 10

This problem statement is the exclusive and proprietary property of TopCoder, Inc. Any unauthorized use or reproduction of this information without the prior written consent of TopCoder, Inc. is strictly prohibited. (c)2010, TopCoder, Inc. All rights reserved.