2726 [Thread-18] WARN org.apache.hadoop.mapred.LocalJobRunner - job_local542592213_0001
java.lang.Exception: org.kitesdk.morphline.api.MorphlineRuntimeException: org.apache.solr.core.SolrResourceNotFoundException: Can't find resource 'solrconfig.xml' in classpath or '/home/iapima/file:/tmp/hadoop-iapima/mapred/local/1490304732115/07193328-e9c3-454c-8523-4a782f9371e4.solr.zip/conf'
    at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:489)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:549)
Caused by: org.kitesdk.morphline.api.MorphlineRuntimeException: org.apache.solr.core.SolrResourceNotFoundException: Can't find resource 'solrconfig.xml' in classpath or '/home/iapima/file:/tmp/hadoop-iapima/mapred/local/1490304732115/07193328-e9c3-454c-8523-4a782f9371e4.solr.zip/conf'
    at org.kitesdk.morphline.solr.SolrLocator.getIndexSchema(SolrLocator.java:209)
    at org.apache.solr.hadoop.morphline.MorphlineMapRunner.<init>(MorphlineMapRunner.java:141)
    at org.apache.solr.hadoop.morphline.MorphlineMapper.setup(MorphlineMapper.java:75)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:270)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.solr.core.SolrResourceNotFoundException: Can't find resource 'solrconfig.xml' in classpath or '/home/iapima/file:/tmp/hadoop-iapima/mapred/local/1490304732115/07193328-e9c3-454c-8523-4a782f9371e4.solr.zip/conf'
    at org.apache.solr.core.SolrResourceLoader.openResource(SolrResourceLoader.java:362)
    at org.apache.solr.core.SolrResourceLoader.openConfig(SolrResourceLoader.java:308)
    at org.apache.solr.core.Config.<init>(Config.java:117)
    at org.apache.solr.core.Config.<init>(Config.java:87)
    at org.apache.solr.core.SolrConfig.<init>(SolrConfig.java:167)
    at org.kitesdk.morphline.solr.SolrLocator.getIndexSchema(SolrLocator.java:201)
    ... 11 more

I checked the party_name_config directory that was created in step 1: solrconfig.xml does exist under its conf subdirectory.

Re: Can't find resource 'solrconfig.xml' in classpath

The error says the job is looking for solrconfig.xml on the local filesystem and cannot find it there.

1. I noticed you are passing hadoop --config /etc/hadoop/conf.cloudera.hdfs. The LocalJobRunner frames in your stack trace show the job running locally instead of on the cluster.

Try passing hadoop --config /etc/hadoop/conf.cloudera.yarn instead: the MRIndexer tool is a MapReduce job and needs that configuration. Make sure the node you are running this from has both the YARN and Solr gateway roles.

2. Can you check in ZooKeeper whether all the configs are in place?

Log in to dwh-mst-dev02.stor.nccourts.org and run:

zookeeper-client

ls /solr

ls /solr/configs

ls /solr/configs/party_name_config

ls /solr/configs/party_name_config/solrconfig.xml

Make sure all of these are present under /solr in ZooKeeper.

3. Can you paste the contents of ~/search/readCSV.conf? Make sure zkHost: dwh-mst-dev02.stor.nccourts.org:2181/solr is set in your morphline config.

Re: Can't find resource 'solrconfig.xml' in classpath

I changed the command to point to /etc/hadoop/conf.cloudera.yarn as suggested. That took care of the earlier error. When I reran the script, I got the error below.

-------------

Error: java.io.IOException: Batch Write Failure
    at org.apache.solr.hadoop.BatchWriter.throwIf(BatchWriter.java:239)
    at org.apache.solr.hadoop.BatchWriter.queueBatch(BatchWriter.java:181)
    at org.apache.solr.hadoop.SolrRecordWriter.close(SolrRecordWriter.java:275)
    at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.close(ReduceTask.java:550)
    at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:629)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1796)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: org.apache.solr.common.SolrException: ERROR: [doc=1966-05-19 10:36:59.373733] unknown field 'file_length'

-----------

It seems not to like the id field which is a string representation of a timestamp.
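A note on reading that message: in Solr's "unknown field" error, the bracketed doc= value only identifies which document was rejected, while the quoted name after "unknown field" is what the schema did not recognise. The sketch below pulls the two apart. The regular expression and the helper name parse_unknown_field are illustrative assumptions, not part of Solr or the indexer tool.

```python
import re

# Shape of the Solr message quoted in the trace above:
#   ERROR: [doc=<unique key>] unknown field '<field name>'
UNKNOWN_FIELD = re.compile(
    r"\[doc=(?P<doc_id>[^\]]*)\] unknown field '(?P<field>[^']+)'"
)


def parse_unknown_field(message):
    """Return (doc_id, field) from an 'unknown field' message, or None."""
    match = UNKNOWN_FIELD.search(message)
    if match is None:
        return None
    return match.group("doc_id"), match.group("field")


# Message copied from the stack trace in this post.
message = (
    "org.apache.solr.common.SolrException: ERROR: "
    "[doc=1966-05-19 10:36:59.373733] unknown field 'file_length'"
)

doc_id, field = parse_unknown_field(message)
print("document key  :", doc_id)
print("rejected field:", field)
```

Read this way, the timestamp-like value is just the key of the failing document, and the field being rejected is file_length.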

Re: Can't find resource 'solrconfig.xml' in classpath

The key being a string is not an issue, as there will be no searches based on the timestamp. Is there a way in the morphline to specify that the field is indeed a string and not a timestamp?

Below is the full stack trace:

Error: java.io.IOException: Batch Write Failure
    at org.apache.solr.hadoop.BatchWriter.throwIf(BatchWriter.java:239)
    at org.apache.solr.hadoop.BatchWriter.queueBatch(BatchWriter.java:181)
    at org.apache.solr.hadoop.SolrRecordWriter.close(SolrRecordWriter.java:275)
    at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.close(ReduceTask.java:550)
    at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:629)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1796)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: org.apache.solr.common.SolrException: ERROR: [doc=1966-05-19 10:36:59.365118] unknown field 'file_length'
    at org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:185)
    at org.apache.solr.update.AddUpdateCommand.getLuceneDocument(AddUpdateCommand.java:78)
    at org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:238)
    at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:164)
    at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:69)
    at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
    at org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:940)
    at org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:1095)
    at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:701)
    at org.apache.solr.handler.loader.XMLLoader.processUpdate(XMLLoader.java:247)
    at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:174)
    at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:99)
    at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:2135)
    at org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:150)
    at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:124)
    at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:68)
    at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:54)
    at org.apache.solr.hadoop.BatchWriter.runUpdate(BatchWriter.java:135)
    at org.apache.solr.hadoop.BatchWriter$Batch.run(BatchWriter.java:90)
    at org.apache.solr.hadoop.BatchWriter.queueBatch(BatchWriter.java:180)
    ... 9 more

Re: Can't find resource 'solrconfig.xml' in classpath

Frankly, I am at a loss. There was a mismatch between my morphline fields and the schema, but I fixed that, and no columns are unaccounted for. Here is my schema, which I confirmed by retrieving it from the Solr web interface:

And here are the fields as defined in readCSV.conf:

columns : [id,county,year,court_type,seq_num,party_role,party_num,party_status,biz_name,prefix,last_name,first_name,middle_name,suffix,in_regards_to,case_status,row_of_origin]

They are identical. I still get the same exception. Any other advice is appreciated.
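One observation on the comparison: the field named in the exception, file_length, is not in the columns list, so checking columns against the schema cannot surface it. The check has to use the fields actually present on the record when it reaches Solr. A minimal sketch of that check follows. It assumes the schema fields equal the columns list (as stated in this post), and the record contents are hypothetical, since the thread does not show where file_length is added. The helper name fields_missing_from_schema is made up for illustration.

```python
# Columns copied from readCSV.conf as posted in this thread.
columns = [
    "id", "county", "year", "court_type", "seq_num", "party_role",
    "party_num", "party_status", "biz_name", "prefix", "last_name",
    "first_name", "middle_name", "suffix", "in_regards_to",
    "case_status", "row_of_origin",
]


def fields_missing_from_schema(record_fields, schema_fields):
    """Return the record fields that the schema does not declare, sorted."""
    declared = set(schema_fields)
    return sorted(name for name in record_fields if name not in declared)


# Assumption: the Solr schema declares exactly the CSV columns.
schema_fields = list(columns)

# Hypothetical record: the CSV columns plus the extra field named in
# the exception. Its origin is not shown anywhere in this thread.
record_fields = columns + ["file_length"]

print(fields_missing_from_schema(columns, schema_fields))
print(fields_missing_from_schema(record_fields, schema_fields))
```

The first call finds nothing, which matches the "they are identical" result. The second reports file_length, the same field the exception names.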