Questions in topic: "mongodb"
https://forums.databricks.com/questions/topics/single/2473.html
The latest questions for the topic "mongodb"

How to insert and update large PHP array: MongoDB and PHP
https://forums.databricks.com/questions/17325/how-to-insert-and-update-large-php-array-mongodb-a.html
<p>I have a loop, and each time the loop iterates it adds a record to an array.</p><p>In my current model, I wait for the loop to finish completely and then insert that array into <a href="https://tekslate.com/insert-document-mongodb/" target="_blank">MongoDB</a>, and I had no problems with that until now.</p><p>Now my array is starting to exceed 10 MB after the loop, and I read that Mongo has a 4 MB limit. I also do not want to hold the whole array in memory while I wait for the loop to finish.</p><p>Ideally, I'd love to do an update within the loop on the same Mongo id of the collection so that I do not need to store the array in memory. However, it seems Mongo will not support this if the document size goes over 4 MB, so I do not know how to go about it.</p><p>I read about GridFS, but I do not see a way to insert arrays using it. Any ideas would be much appreciated.</p>
Tags: mongodb | Thu, 14 Mar 2019 09:10:59 GMT | niana
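<p>A note on the question above: current MongoDB versions limit a single BSON document to 16 MB, not 4 MB, and the limit applies per document rather than per collection. The usual way to avoid holding the whole array in memory is to insert in fixed-size batches inside the loop. A minimal sketch in Python with PyMongo (the PHP library's insertMany is analogous); the database, collection, and generate_records() source are placeholders:</p><pre>from pymongo import MongoClient

client = MongoClient("mongodb://127.0.0.1:27017")
coll = client["mydb"]["records"]

BATCH_SIZE = 1000          # tune to keep memory use bounded
batch = []
for record in generate_records():   # hypothetical stand-in for the loop
    batch.append(record)
    if len(batch) >= BATCH_SIZE:
        coll.insert_many(batch)     # one round trip per batch
        batch = []
if batch:
    coll.insert_many(batch)         # flush the remainder</pre>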

how do you contact forum admin? My question is waiting for moderator approval?
https://forums.databricks.com/questions/16762/how-do-you-contact-forum-admin-my-question-is-wait.html
<p>Hi, I am new to Databricks.</p><p>I posted a question a couple of days ago and am still waiting for the moderator to approve it.</p><p>Any idea how I contact the admin?</p><p>Is there an unmoderated forum?</p><p>I noticed several people are following my question, like @cfregly; are they the admins/moderators?</p><p>Does Databricks offer some sort of technical support? We are signed up for the "standard plan".</p><p><a href="https://forums.databricks.com/questions/16716/newbie-trouble-with-mongodb-connector-quick-start.html" target="_blank">https://forums.databricks.com/questions/16716/newbie-trouble-with-mongodb-connector-quick-start.html</a></p><p>Kind regards</p><p>Andy</p>
Tags: mongodb, forums | Tue, 12 Feb 2019 16:43:40 GMT | andy

newbie trouble with mongoDB connector quick start
https://forums.databricks.com/questions/16716/newbie-trouble-with-mongodb-connector-quick-start.html
<p>Hi,</p><p>I am trying to work through the MongoDB sample notebook. I am new to Databricks.</p><p><a href="https://docs.databricks.com/_static/notebooks/mongodb.html" target="_blank">https://docs.databricks.com/_static/notebooks/mongodb.html</a></p><p>I created a MongoDB Atlas account and a free-tier cluster and uploaded the sample MovieLens dataset.</p><p>I cannot connect from my notebook. I get the following error:</p><pre>import com.mongodb.spark._
val ratings = spark.read.format("com.mongodb.spark.sql.DefaultSource").option("database", "recommendation").option("collection", "ratings").load()

com.mongodb.MongoTimeoutException: Timed out after 30000 ms while waiting for a server that matches com.mongodb.client.internal.MongoClientDelegate$1@810e629. Client view of cluster state is {type=REPLICA_SET, servers=[{address=cluster0-shard-00-01-3p6rj.mongodb.net:27017, type=UNKNOWN, state=CONNECTING, exception={com.mongodb.MongoSocketReadException: Prematurely reached end of stream}}, {address=cluster0-shard-00-00-3p6rj.mongodb.net:27017, type=UNKNOWN, state=CONNECTING, exception={com.mongodb.MongoSocketReadException: Prematurely reached end of stream}}, {address=cluster0-shard-00-02-3p6rj.mongodb.net:27017, type=UNKNOWN, state=CONNECTING, exception={com.mongodb.MongoSocketReadException: Prematurely reached end of stream}}]</pre><p>I suspect the issue is that I did not add Databricks to the whitelist correctly.</p><p>I do not understand the directions:</p><p><em><strong>Enable Databricks clusters to connect to the cluster by adding the Databricks IPs to the <a href="https://docs.atlas.mongodb.com/setup-cluster-security/#add-ip-addresses-to-the-whitelist">whitelist in Atlas</a>.</strong></em></p><p>Where would I find the Databricks IP address for my cluster? I tried using the IP listed on the master's REST API.</p><p>Any suggestions would be greatly appreciated.</p><p>Andy</p>
Tags: mongodb | Mon, 11 Feb 2019 02:00:58 GMT | andy
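<p>On the whitelist question above: Databricks cluster nodes usually have dynamic public IPs, so a common workaround while testing is to allow access from anywhere (0.0.0.0/0) in the Atlas IP whitelist; for production, whitelist the workspace's stable egress (NAT) IP instead. "Prematurely reached end of stream" can also mean TLS is not enabled; Atlas requires it, and the mongodb+srv scheme implies it. A quick connectivity check from a Python notebook, with placeholder credentials and host (the srv scheme needs the dnspython package installed):</p><pre>from pymongo import MongoClient

# USER, PASSWORD, and the cluster host are placeholders.
uri = "mongodb+srv://USER:PASSWORD@cluster0-3p6rj.mongodb.net/recommendation"
client = MongoClient(uri, serverSelectionTimeoutMS=10000)
print(client.admin.command("ping"))  # raises if whitelist, TLS, or credentials are wrong</pre>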

How to perfectly store dataframe using for each writer in scala with map column in mongodb?
https://forums.databricks.com/questions/16445/how-to-perfectly-store-dataframe-using-for-each-wr.html
<p>Hi... A case class is needed to define the data frame that is to be stored in <a href="https://tekslate.com/mongodb-database/" target="_blank">MongoDB</a>...</p>
Tags: mongodb | Thu, 24 Jan 2019 09:04:29 GMT | niana

Query to Mongo using Spark (Java)
https://forums.databricks.com/questions/15437/query-to-mongo-using-spark-java.html
<p>Hi guys!</p><p>I'm working on a project in Eclipse using Spark SQL and Mongo in Java.</p><p>My Mongo database has around 50,000,000 documents stored. I just want to get the data I need, for example the documents that fall between two dates.</p><p>How can I do this with Apache Spark + Mongo in Java?</p>
Tags: spark sql, java, apache-spark, mongodb, apache-mongo | Wed, 10 Oct 2018 14:14:40 GMT | JCantona
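<p>For the date-range question above, one approach is to load the collection through the MongoDB Spark connector and filter the DataFrame; simple comparison filters like this are pushed down to MongoDB so only matching documents are transferred. Sketched in Python, the Java DataFrame API being analogous; the URI and the created_at field are placeholders:</p><pre>from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = (SparkSession.builder
         .appName("mongo_dates")
         .config("spark.mongodb.input.uri", "mongodb://127.0.0.1/mydb.events")
         .getOrCreate())

df = spark.read.format("com.mongodb.spark.sql.DefaultSource").load()

# between() is inclusive on both ends; the predicate is pushed down.
subset = df.filter(col("created_at").between("2018-01-01", "2018-06-30"))
subset.show()</pre>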

Can I use DDL statements in MongoDB BI Connector?
https://forums.databricks.com/questions/13399/can-i-use-ddl-statements-in-mongodb-bi-connector.html
<p>Can I use DDL statements (CREATE/DROP/ALTER TABLE) in version 2.3?</p><p>I need to execute DDL statements through the MySQL provider to create/alter/drop a Mongo database/collection from C#. Is it possible?</p><p>I am trying to create a new database using the MySQL client, but an error occurred:</p><pre>CREATE DATABASE db1;</pre><p><img src="/storage/attachments/694-createdb.png"></p>
Tags: mongodb, mongodb-bi-connector | Wed, 21 Feb 2018 10:56:46 GMT | sjahongir

MongoDB Spark Connector py4j.protocol.Py4JJavaError: An error occurred while calling o50.load
https://forums.databricks.com/questions/12223/mongodb-spark-connector-py4jprotocolpy4jjavaerror.html
<p>I have been able to load this MongoDB database before, but am now receiving an error I haven't been able to figure out.</p><p>Here is how I start my Spark session:</p><pre>spark = SparkSession.builder \
    .master("local[*]") \
    .appName("collab_rec") \
    .config("spark.mongodb.input.uri",
            "mongodb://127.0.0.1/example.collection") \
    .config("spark.mongodb.output.uri",
            "mongodb://127.0.0.1/example.collection") \
    .getOrCreate()</pre><p>I run this script so that I can interact with Spark through IPython, which loads the Mongo Spark connector package:</p><pre>#!/bin/bash
export PYSPARK_DRIVER_PYTHON=ipython
# NB: repeated --packages flags override one another, so all coordinates
# go in a single comma-separated list.
${SPARK_HOME}/bin/pyspark \
    --master local[4] \
    --executor-memory 1G \
    --driver-memory 1G \
    --conf spark.sql.warehouse.dir="file:///tmp/spark-warehouse" \
    --packages com.databricks:spark-csv_2.11:1.5.0,com.amazonaws:aws-java-sdk-pom:1.10.34,org.apache.hadoop:hadoop-aws:2.7.3,org.mongodb.spark:mongo-spark-connector_2.11:2.0.0</pre><p>Spark loads fine and it appears the package is loading correctly as well. Here is how I attempt to load that database into a dataframe:</p><pre>df = spark.read.format("com.mongodb.spark.sql.DefaultSource").load()</pre><p>However, on that line, I receive the following error:</p><pre>py4j.protocol.Py4JJavaError: An error occurred while calling o50.load. : org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost, executor driver): java.lang.NoSuchMethodError: org.apache.spark.sql.catalyst.analysis.TypeCoercion$.findTightestCommonTypeOfTwo()Lscala/Function2;</pre><p>From what I can see in the following documentation/tutorial, I am attempting to load the dataframe correctly: https://docs.mongodb.com/spark-connector/master/python-api/</p><p>I am using Spark 2.2.0.</p><p>Note that I have been able to replicate this error on both my Mac and Linux through AWS.</p>
Tags: python, spark 2.0, mongodb | Wed, 16 Aug 2017 02:36:19 GMT | Micah Shanks
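<p>On the NoSuchMethodError above: this is the usual signature of a Spark/connector version mismatch. mongo-spark-connector 2.0.0 targets Spark 2.0.x, and the TypeCoercion internals it calls changed in Spark 2.2, so a connector built for Spark 2.2.x (2.2.0) is needed. A sketch of pinning the matching connector from inside the script; spark.jars.packages only takes effect if set before the SparkContext starts:</p><pre>from pyspark.sql import SparkSession

# Align the connector version with the Spark version (2.2.x here).
spark = (SparkSession.builder
         .appName("collab_rec")
         .config("spark.jars.packages",
                 "org.mongodb.spark:mongo-spark-connector_2.11:2.2.0")
         .config("spark.mongodb.input.uri", "mongodb://127.0.0.1/example.collection")
         .getOrCreate())

df = spark.read.format("com.mongodb.spark.sql.DefaultSource").load()</pre>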

load data from mongoDB to Spark
https://forums.databricks.com/questions/11418/load-data-from-mongodb-to-spark.html
<p>Hi guys, I am using Apache Spark 2.0.0 with the Scala IDE, and I want to read data from MongoDB into Apache Spark as a dataframe. I added mongo-spark-connector_2.11-2.0.0. The version of MongoDB I used is 3.0.54. Here is the code I wrote, and I got the error in the image below. Any help will be appreciated, thanks.</p><p><img src="/storage/attachments/454-mongodberror.png"></p><pre>import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.SQLContext
import com.mongodb.spark.sql._
import com.mongodb.spark._
import org.bson.Document
import com.mongodb.spark.config._

object Mongo extends App {
  try {
    val sparkSession = SparkSession.builder().master("local").getOrCreate()

    // Builds a "mongodb://host:port/db.collection" URI
    def makeMongoURI(uri: String, database: String, collection: String) =
      s"${uri}/${database}.${collection}"

    val mongoURI = "mongodb://127.0.0.1:27017"
    val Conf = makeMongoURI(mongoURI, "io", "thing")
    val readConfigintegra: ReadConfig = ReadConfig(Map("uri" -&gt; Conf))

    // Uses the ReadConfig
    val df3 = sparkSession.sqlContext.loadFromMongoDB(ReadConfig(Map("uri" -&gt; "mongodb://127.0.0.1:27017/io.thing")))
    df3.printSchema()
  } catch {
    case t: Throwable =&gt;
      t.printStackTrace() // TODO: handle error
      println(t.getMessage)
  }
}</pre>
Tags: spark, spark sql, scala spark, mongodb | Wed, 19 Apr 2017 09:23:58 GMT | zoro07500

How to get mongodb connectivity in databricks platform
https://forums.databricks.com/questions/9633/how-to-get-mongodb-connectivity-in-databricks-plat.html
<p>I have a job that uses MongoDB as the source of data. I am not able to connect to the Mongo server. I've already installed the Mongo packages on my cluster, but I am still not able to get connectivity.</p><p>https://github.com/Stratio/spark-mongodb/blob/master/doc/src/site/sphinx/First_Steps.rst</p><p>This is the guide that I followed to connect to a Mongo server on a local machine.</p>
Tags: databricks, mongodb | Tue, 06 Sep 2016 06:31:36 GMT | rachuri.anil
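<p>A sketch for the connectivity question above, assuming the official MongoDB Spark connector is attached to the cluster as a Maven library (for example org.mongodb.spark:mongo-spark-connector_2.11) and that the Mongo host is reachable from the Databricks workers; note that the linked Stratio connector is a different, third-party package. The URI and names are placeholders:</p><pre># Read a collection into a DataFrame through the official connector.
df = (spark.read.format("com.mongodb.spark.sql.DefaultSource")
      .option("uri", "mongodb://USER:PASSWORD@MONGO-HOST:27017/mydb.mycollection")
      .load())
df.printSchema()</pre>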

How to connect to MongoDB Database in Python Notebook?
https://forums.databricks.com/questions/9603/how-to-connect-to-mongdb-database-in-python-notebo.html
<p>I have the following information:</p><pre>mongo_host = "46.101.175.29"
mongo_port = 27017
mongo_db = "master"
mongo_user = "****************"
mongo_password = "*****************"</pre><p>I can easily connect to the database from my PC using the Python library PyMongo, but I can't find information on how to do it in a Databricks Python notebook. Could you please explain how to do it?</p>
Tags: python, notebook, mongodb, database, connectors | Tue, 30 Aug 2016 14:56:11 GMT | IgorSizon
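<p>Once PyMongo is attached to the cluster as a PyPI library, the connection code in a Databricks notebook is the same as on a local PC. A minimal sketch using the values from the question (credentials are placeholders):</p><pre>from pymongo import MongoClient

# USER and PASSWORD are placeholders; "master" doubles as the auth database here.
uri = "mongodb://USER:PASSWORD@46.101.175.29:27017/master"
client = MongoClient(uri)
db = client["master"]
print(db.list_collection_names())  # quick check that the connection works</pre>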

How to configure an external package into Apache Spark?
https://forums.databricks.com/questions/9531/how-to-configure-an-external-package-into-apache-s.html
<p>I am building a Python script, executed with the <strong>spark-submit</strong> command, that retrieves data from a <strong>MongoDB</strong> collection and processes the fetched data to generate analytics. I am using the MongoDB Spark connector to query a MongoDB collection via the <strong>--packages</strong> option.</p><p>But I need to configure the package in Apache Spark and execute the Python script using spark-submit without the <strong>--packages</strong> option.</p><p>Please suggest an appropriate solution.</p>
Tags: pyspark, mongodb, spark-packages | Mon, 22 Aug 2016 10:09:56 GMT | rporwal
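<p>Two common ways to drop the --packages flag for the question above: add spark.jars.packages to conf/spark-defaults.conf, or set the same key in the script itself before the session is created. A sketch of the in-script variant with example coordinates; note this only takes effect if no SparkContext is already running:</p><pre>from pyspark.sql import SparkSession

# Example coordinates; the same value can live in conf/spark-defaults.conf
# under "spark.jars.packages" to keep spark-submit invocations clean.
spark = (SparkSession.builder
         .appName("mongo_analytics")
         .config("spark.jars.packages",
                 "org.mongodb.spark:mongo-spark-connector_2.11:2.0.0")
         .getOrCreate())</pre>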

How to execute python script using spark submit in php?
https://forums.databricks.com/questions/9467/how-to-execute-python-script-through-spark-submit.html
<p>I am working on a PHP application using MongoDB as its database platform. A collection in MongoDB holds a massive volume of data, so I have opted for Apache Spark to generate analytics.</p><p>I need to execute the spark-submit command from the PHP application, but its execution doesn't return any output to the PHP application.</p><p>Following is the code snippet:</p><pre>$result = shell_exec('./bin/spark-submit examples/src/main/python/pi.py');
print_r($result);</pre><p>Please suggest an appropriate solution.</p>
Tags: pyspark, apache spark, mongodb, php, spark-mongo | Tue, 16 Aug 2016 07:31:40 GMT | rporwal
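<p>A likely cause for the empty result above: shell_exec() captures only stdout, while spark-submit writes its own logging to stderr, so appending 2&gt;&amp;1 to the command string usually surfaces the output. The same pattern sketched in Python for clarity, with the paths from the question:</p><pre>import subprocess

# Capture stdout and stderr separately; spark-submit logs to stderr,
# while the script's own print output (e.g. "Pi is roughly ...") is stdout.
proc = subprocess.run(
    ["./bin/spark-submit", "examples/src/main/python/pi.py"],
    capture_output=True, text=True)
print(proc.stdout)
print(proc.stderr)</pre>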

From Webinar Databricks Data Pipelines: Can we use Databricks and Apache Spark as an "Operational Data Store"?
https://forums.databricks.com/questions/9445/from-webinar-databricks-data-pipelines-can-we-use.html
<p>Meaning data ingested as batches, with incremental updates when users modify previously loaded data.</p>
Tags: webinar, mongodb, operational data store | Fri, 12 Aug 2016 18:01:15 GMT | dave_wang

How to build Spark data frame with filtered records from MongoDB?
https://forums.databricks.com/questions/9403/how-to-build-spark-data-frame-with-filtered-record.html
<p>My application is built on MongoDB as its platform. One collection in the DB holds a massive volume of data, and I have opted for Apache Spark to retrieve it and generate analytical data through calculation. I have configured the <a href="https://docs.mongodb.com/spark-connector/getting-started/">Spark Connector for MongoDB</a> to communicate with MongoDB. I need to query the MongoDB collection using <strong>pyspark</strong> and build a dataframe consisting of the result set of the MongoDB query. Please suggest an appropriate solution.</p>
Tags: pyspark, mongodb | Tue, 09 Aug 2016 10:02:04 GMT | rporwal
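<p>One way to build the filtered DataFrame from pyspark: load the collection, register it as a temp view, and filter in SQL; simple predicates like this are pushed down to MongoDB through the connector. The URI, view name, and status field are placeholders:</p><pre># Load the collection, then filter with Spark SQL.
df = (spark.read.format("com.mongodb.spark.sql.DefaultSource")
      .option("uri", "mongodb://127.0.0.1/mydb.mycollection")
      .load())
df.createOrReplaceTempView("mycollection")

filtered = spark.sql("SELECT * FROM mycollection WHERE status = 'active'")
filtered.show()</pre>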

working with mongodb and spark 2.0
https://forums.databricks.com/questions/9376/working-with-mongodb-and-spark-20.html
<p>I am trying to read and write dataframes from/to MongoDB.</p><p>I saw that the best way is to use https://docs.mongodb.com/spark-connector/getting-started/</p><p>How would I configure this inside the Databricks cloud?</p><p>What would I need to do to use it with Spark 2.0?</p>
Tags: spark 2.0, mongodb, mongo | Sun, 07 Aug 2016 15:02:24 GMT | assaf_mendelson
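<p>A sketch of the setup for the question above: on Databricks, attach the connector as a Maven library on the cluster, matching the Spark and Scala versions (for Spark 2.0 with Scala 2.11 that would be org.mongodb.spark:mongo-spark-connector_2.11:2.0.0), then read and write with the DefaultSource format. URIs are placeholders:</p><pre># Read from one collection and append the result to another.
read_df = (spark.read.format("com.mongodb.spark.sql.DefaultSource")
           .option("uri", "mongodb://MONGO-HOST:27017/mydb.in_collection")
           .load())

(read_df.write.format("com.mongodb.spark.sql.DefaultSource")
    .option("uri", "mongodb://MONGO-HOST:27017/mydb.out_collection")
    .mode("append")
    .save())</pre>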