5.
Availability, Scalability and Efficiency <ul><li>… how fast can you go from data to answers? </li></ul><ul><li>Unstructured data must be analyzed before it makes sense. </li></ul><ul><li>Semi-structured data is parsed according to a spec (or by brute force). </li></ul><ul><li>Structured data can be optimized for ad-hoc analysis. </li></ul>

6.
Hadoop / Vertica <ul><li>Hadoop provides a distributed processing framework (MapReduce) </li></ul><ul><li>Hadoop provides a distributed storage layer (HDFS) </li></ul><ul><li>Vertica can be used as a data source and target for MapReduce </li></ul><ul><li>Data can also be moved between Vertica and HDFS (using Sqoop) </li></ul><ul><li>Hadoop talks to Vertica via custom input and output formatters </li></ul>
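To make the MapReduce bullet concrete, here is a minimal single-process sketch of the map, shuffle, and reduce phases using word count. It deliberately uses only the Java standard library and is not the Hadoop API or the Vertica formatter; the class and method names (`MapReduceSketch`, `wordCount`, and so on) are hypothetical and chosen for illustration. In a real job, Hadoop distributes these phases across nodes and the input or output could be a Vertica table instead of an in-memory list.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Single-process illustration of the MapReduce phases; NOT the Hadoop API.
class MapReduceSketch {

    // Map phase: emit one (word, 1) pair for every word in a line.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<Map.Entry<String, Integer>>();
        for (String word : line.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                out.add(new AbstractMap.SimpleEntry<String, Integer>(word, 1));
            }
        }
        return out;
    }

    // Shuffle phase: group all emitted values by key (sorted by key).
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<String, List<Integer>>();
        for (Map.Entry<String, Integer> pair : pairs) {
            List<Integer> values = grouped.get(pair.getKey());
            if (values == null) {
                values = new ArrayList<Integer>();
                grouped.put(pair.getKey(), values);
            }
            values.add(pair.getValue());
        }
        return grouped;
    }

    // Reduce phase: sum the values collected for a single key.
    static int reduce(List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    // Driver: run map over every line, shuffle, then reduce each key.
    static Map<String, Integer> wordCount(List<String> lines) {
        List<Map.Entry<String, Integer>> emitted = new ArrayList<Map.Entry<String, Integer>>();
        for (String line : lines) {
            emitted.addAll(map(line));
        }
        Map<String, Integer> result = new TreeMap<String, Integer>();
        for (Map.Entry<String, List<Integer>> e : shuffle(emitted).entrySet()) {
            result.put(e.getKey(), reduce(e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(wordCount(Arrays.asList("to be or not to be", "be quick")));
    }
}
```

The three static methods mirror the roles a Mapper, the framework's shuffle, and a Reducer play in Hadoop; only the data movement between them is simplified away.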

14.
How to get started <ul><li>Get a copy of Hadoop from Apache or Cloudera </li></ul><ul><li>Get Vertica from www.vertica.com, via Amazon or RightScale, or as a VM </li></ul><ul><li>Grab the formatter and the Vertica JDBC drivers from vertica.com/MapReduce </li></ul><ul><li>The formatter is included in contrib as of Hadoop 0.21.0 (MR-775) </li></ul><ul><li>Put the jars in hadoop/lib </li></ul><ul><li>Run your Hadoop/Vertica job </li></ul>
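After the jars are copied into hadoop/lib, a quick sanity check before running a job is to confirm that the expected classes can be found on the classpath. The sketch below is one way to do that with plain Java. The class `ClasspathCheck` is hypothetical, and the driver class name `com.vertica.Driver` is an assumption that may differ between Vertica JDBC driver versions; pass the class names from your own jars as arguments to override the defaults.

```java
// Reports whether given classes are visible on the current classpath.
class ClasspathCheck {

    // Returns true if the named class can be located, without initializing it.
    static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only locate the class, do not run static initializers.
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults are illustrative; supply your own class names as arguments.
        String[] required = args.length > 0 ? args : new String[] {
            "org.apache.hadoop.mapreduce.Job", // Hadoop MapReduce job API
            "com.vertica.Driver"               // ASSUMPTION: Vertica JDBC driver class name
        };
        for (String name : required) {
            System.out.println((isOnClasspath(name) ? "found    " : "MISSING  ") + name);
        }
    }
}
```

Run it with the same classpath the job will use, so that a "MISSING" line points at a jar that did not make it into hadoop/lib.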