steve_l
added a comment - 18/Aug/09 16:32
create a wiki page documenting current best practices, symlink tricks, etc.
add the option of using ivy to retrieve artifacts, by patching up the ivy.xml files (see the sketch below)
deal with any cycles in the dependencies, so allowing ivy to order multiple builds (this is tricky because of testing, but we can do some tricks here)
provide an optional main build file (which repository?) to build everything in the right order
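
For the ivy option, here is a minimal sketch of what a patched ivy.xml for hadoop-hdfs might look like. The organisation, module names, configuration names and revisions are illustrative assumptions, not the project's actual coordinates:

<!-- hypothetical ivy.xml fragment for hadoop-hdfs -->
<ivy-module version="2.0">
  <info organisation="org.apache.hadoop" module="hadoop-hdfs"/>
  <configurations>
    <conf name="common" description="compile/runtime dependencies"/>
    <conf name="test" extends="common" description="test-only dependencies"/>
  </configurations>
  <dependencies>
    <!-- needed to compile hdfs at all -->
    <dependency org="org.apache.hadoop" name="hadoop-core"
                rev="latest.release" conf="common->default"/>
    <!-- only needed by the hdfs-with-mr tests, so test-scoped -->
    <dependency org="org.apache.hadoop" name="hadoop-mapred"
                rev="latest.release" conf="test->default"/>
  </dependencies>
</ivy-module>

Putting the mapred dependency in a test-only configuration is what would let ivy treat the compile-time dependency graph as acyclic.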

Todd Lipcon
added a comment - 18/Aug/09 18:24

> deal with any cycles in the dependencies, so allowing ivy to order multiple builds (this is tricky because of testing, we can do some tricks here)
Where might we have cyclic dependencies? It seems to me, if we have such, they should be hunted down and destroyed mercilessly rather than worked around in the build process, right?

steve_l
added a comment - 19/Aug/09 11:21

Hadoop-hdfs depends on hadoop-mapreduce for testing, hence, a cycle.

Right now I am playing tricks with symlinks to hook up the lib directories, so what I build in one dir is automatically picked up by the adjacent project. I am documenting what I am doing for the hadoop wiki, but it's a bit complex.

Options:

1. flatten: pull out the run-test-hdfs-with-mr bit of hadoop-hdfs and move it to a subproject that depends on both hadoop-hdfs and hadoop-mapreduce.

2. bootstrap via the central repository: rather than keep copies of artifacts in the different bits of SVN, stick some alpha releases of everything up onto the central repository. Then you can use ivy to pull things in, so when I build hdfs the latest versions of common and mapreduce get picked up. If I publish locally, I get the version I ask for, but the default would be to get the last release on the central repo.

I'm coming round in favour of #2, because it helps us debug the publishing process with, say, a fortnightly alpha release of the artifacts (PMC approval still needed, incidentally), so that when the time comes to do real beta releases, the POMs and suchlike are stable.
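
A sketch of the ivysettings.xml that would give that behaviour — resolver names and patterns are assumptions for illustration:

<!-- hypothetical ivysettings.xml: prefer locally published artifacts,
     fall back to the last alpha release on the central repository -->
<ivysettings>
  <settings defaultResolver="hadoop-chain"/>
  <resolvers>
    <chain name="hadoop-chain" returnFirst="true">
      <!-- picks up anything published locally with ivy:publish -->
      <filesystem name="local">
        <ivy pattern="${user.home}/.ivy2/local/[organisation]/[module]/[revision]/ivy.xml"/>
        <artifact pattern="${user.home}/.ivy2/local/[organisation]/[module]/[revision]/[artifact].[ext]"/>
      </filesystem>
      <!-- otherwise fall back to the central (ibiblio) repository -->
      <ibiblio name="central" m2compatible="true"/>
    </chain>
  </resolvers>
</ivysettings>

With returnFirst="true" a locally published revision wins; otherwise a rev of latest.release resolves to whatever alpha was last pushed to the central repo.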

Doug Cutting
added a comment - 20/Aug/09 19:24

It's not actually a cycle, since tests are layered after. Each can be built independently. They can't be fully tested independently, but that's different. Perhaps we should separate the tests that require mapreduce from those that do not.

A simple lo-tek solution might be to reference ../mapreduce/build/ in the hdfs test classpath and require that developers who wish to run those tests check things out in sibling directories. We could even then have a single build target that builds and tests everything.
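
An illustrative Ant fragment of that sibling-checkout approach — the path ids, directory layout and target names are assumptions:

<!-- hypothetical fragment of the hdfs build.xml: extend the test
     classpath with the sibling mapreduce checkout's build output -->
<path id="test.with.mr.classpath">
  <path refid="test.classpath"/>
  <pathelement location="${basedir}/../mapreduce/build/classes"/>
</path>

<!-- a single driver target that builds the sibling checkouts in order -->
<target name="build-all">
  <subant target="jar">
    <filelist dir="${basedir}/..">
      <file name="common/build.xml"/>
      <file name="hdfs/build.xml"/>
      <file name="mapreduce/build.xml"/>
    </filelist>
  </subant>
</target>

A <filelist> keeps its declared order, which is what lets the driver build common before hdfs before mapreduce.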

steve_l
added a comment - 21/Aug/09 12:28

When I tried to bump up the artifact version number by way of a shared build.properties file, hdfs was not happy, as in "refuses to build the base JAR" not happy. Therefore a cycle exists in the jar build process, even if the dependencies only come together at test time.
init:
[mkdir] Created dir: /Users/slo/Java/Hadoop/lifecycle/hadoop-hdfs/build/classes
[mkdir] Created dir: /Users/slo/Java/Hadoop/lifecycle/hadoop-hdfs/build/src
[mkdir] Created dir: /Users/slo/Java/Hadoop/lifecycle/hadoop-hdfs/build/webapps/hdfs/WEB-INF
[mkdir] Created dir: /Users/slo/Java/Hadoop/lifecycle/hadoop-hdfs/build/webapps/datanode/WEB-INF
[mkdir] Created dir: /Users/slo/Java/Hadoop/lifecycle/hadoop-hdfs/build/webapps/secondary/WEB-INF
[mkdir] Created dir: /Users/slo/Java/Hadoop/lifecycle/hadoop-hdfs/build/ant
[mkdir] Created dir: /Users/slo/Java/Hadoop/lifecycle/hadoop-hdfs/build/test
[mkdir] Created dir: /Users/slo/Java/Hadoop/lifecycle/hadoop-hdfs/build/test/classes
[mkdir] Created dir: /Users/slo/Java/Hadoop/lifecycle/hadoop-hdfs/build/test/extraconf
[touch] Creating /var/folders/6j/6jD1CUYiGs43jrHUr7BepU+++TI/-Tmp-/null2099585006
[delete] Deleting: /var/folders/6j/6jD1CUYiGs43jrHUr7BepU+++TI/-Tmp-/null2099585006
[copy] Copying 2 files to /Users/slo/Java/Hadoop/lifecycle/hadoop-hdfs/build/webapps
BUILD FAILED
/Users/slo/Java/Hadoop/lifecycle/hadoop-hdfs/build.xml:259: src '/Users/slo/Java/Hadoop/lifecycle/hadoop-hdfs/lib/hadoop-mapred-0.21.0-alpha-15.jar' doesn't exist.
I've documented what I've done to get the build working:
http://wiki.apache.org/hadoop/BuildingHadoopFromSVN
This has:
symbolic links to hook up the files. This is why you should be building on a unix.
a boot process where you don't flip the version marker on mapreduce until you've built hdfs.
That wiki entry documents what we have today; it's the starting point for what we have to do to simplify things.
We could have a shared base build.xml used by all projects, externally pulled in by -hdfs and -mapreduce.
I would like to use ivy to glue together stuff locally and remotely. For this to work we need at least one alpha release of 0.21 up in the big ibiblio repository.
The hdfs build could be tweaked to only bail out at test time if the mapred JAR is missing, because that is the only time it is needed (see the sketch below).
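
A sketch of that last tweak — the property, path and target names here are invented for illustration:

<!-- hypothetical build.xml fragment: probe for the mapred JAR up front,
     but only fail if the MR-dependent tests actually run -->
<target name="check-mapred-jar">
  <available property="mapred.jar.present"
             file="${lib.dir}/hadoop-mapred-${version}.jar"/>
</target>

<target name="run-test-hdfs-with-mr" depends="check-mapred-jar, compile-tests">
  <fail unless="mapred.jar.present"
        message="hadoop-mapred JAR not found in ${lib.dir}; build mapreduce first"/>
  <!-- the MR-dependent test run would go here -->
</target>

The compile and jar targets no longer depend on the probe, so a missing mapred JAR only breaks the test run, not the base build.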

steve_l
added a comment - 28/Aug/09 11:23

Having played with this locally:
1. the unzip is only needed to get at the webapps/static content. This content could be moved into common.
2. the JARs are only needed to compile and test hdfs-with-mr.
It's fairly easy to skip all of these steps if the JARs aren't found. The dependencies are still there, but you can at least build the JARs from scratch.
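
A sketch of that skip logic, with made-up property and target names, and assuming the static content is extracted from the mapred artifact:

<!-- hypothetical fragment: probe once, then guard the MR-dependent
     steps so the base JARs still build when the artifacts are absent -->
<target name="init-mapred-check">
  <available property="mapred.jar.present"
             file="${lib.dir}/hadoop-mapred-${version}.jar"/>
</target>

<!-- unzip of the webapps/static content, skipped when the JAR is absent -->
<target name="extract-static-content" depends="init-mapred-check"
        if="mapred.jar.present">
  <unzip src="${lib.dir}/hadoop-mapred-${version}.jar"
         dest="${build.dir}/webapps"/>
</target>

<!-- compile the hdfs-with-mr tests only when the JAR is present -->
<target name="compile-hdfs-with-mr" depends="init-mapred-check"
        if="mapred.jar.present">
  <javac srcdir="${test.src.dir}/hdfs-with-mr"
         destdir="${test.build.dir}/classes"
         classpath="${lib.dir}/hadoop-mapred-${version}.jar"/>
</target>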