tag:support.flying-sphinx.com,2011-01-05:/discussions/problems/26921-indexing-failingFlying Sphinx: Discussion 2018-03-11T00:19:00Ztag:support.flying-sphinx.com,2011-01-05:Comment/448293722018-03-02T15:19:22Z2018-03-02T15:19:25ZIndexing failing<div><p>Hi, I have a couple of indexes which are failing</p>
<pre>
<code>indexing index 'issue_core'...
ERROR: index 'issue_core': sql_range_query: ERROR: integer out of range
indexing index 'tracker_core'...
ERROR: index 'tracker_core': raw_hits: write error: 474119 of 1048450 bytes written.</code>
</pre>
<p>For the first error, I tried to solve it by using big_document_ids, as you can see here<br>
<a href="https://github.com/bountysource/core/blob/master/app/indices/issue_index.rb">https://github.com/bountysource/core/blob/master/app/indices/issue_...</a></p>
<p>However, the problem persists.</p>
<p>For the second error, would appreciate some help diagnosing this.</p>
<p>Setup:-<br></p>
<pre>
<code>gem 'thinking-sphinx', '~&gt; 3.1.3'
gem 'flying-sphinx', '~&gt; 1.0.0'<br><br>
<br>Sphinx 2.1.4-id64-release (rel21-r4421)
herokuapp - bs-prod</code>
</pre>
<p>Thanks for your help!</p></div>You Jing Wongtag:support.flying-sphinx.com,2011-01-05:Comment/448293722018-03-03T00:22:30Z2018-03-03T00:22:30ZIndexing failing<div><p>Hi there,</p>
<p>I&#8217;ve just been looking into this. Regarding the first issue: the integer out of range error is often due to timestamps rather than primary keys - Sphinx stores timestamps as 32-bit integers (based on UNIX timestamps), so anything that is beyond the year 2106 is going to cause problems. The only timestamp in your issue index is remote_created_at - do you allow human-entered values for that column? And are there any values that are 80+ years in the future? I'm guessing there might be some typos that you'll need to manually fix.</p>
<p>Regarding the second issue&#8230; it looks like you&#8217;re currently over the plan limit for the amount of data you&#8217;re dealing with, and are running out of disk space. As a temporary change, removing the <code>:infixes</code> option from the name field or increasing the <code>:min_infixes_len</code> setting should help reduce the amount of disk space required.</p>
<p>Also, within your tracker index, you may be able to improve the indexing speed, by removing the following attribute:</p>
<pre>
<code>has '(select count(*) from issues where tracker_id=trackers.id)', as: :issue_count, type: :integer</code>
</pre>
<p>And replacing it with these two lines:</p>
<pre>
<code>joins :issues
has &#8220;COUNT(issues.id)&#8221;, as: :issue_count, type: :integer</code>
</pre>
<p>Certainly, test that change locally first :)</p>
<p>Let me know how you go with all of this - certainly keen to get it all working with you!</p></div>Pat Allantag:support.flying-sphinx.com,2011-01-05:Comment/448293722018-03-05T00:51:56Z2018-03-05T00:51:57ZIndexing failing<div><p>Thank you.</p>
<p>I've found the dates that are invalid. Thank you for your help! :)</p></div>You Jing Wongtag:support.flying-sphinx.com,2011-01-05:Comment/448293722018-03-05T00:59:05Z2018-03-05T00:59:05ZIndexing failing<div><p>Good to know - hopefully indexing's more reliable now :)</p></div>Pat Allantag:support.flying-sphinx.com,2011-01-05:Comment/448293722018-03-08T16:58:19Z2018-03-08T16:58:20ZIndexing failing<div><p>Hi again,</p>
<p>I've fixed all the remote_created_at dates such that it's always less than year 2106.</p>
<p>I've also checked all the other values, and it doesn't seem like there is anything else that would trigger this error.</p>
<p>Do you have any idea what I can do to debug this better?</p></div>You Jing Wongtag:support.flying-sphinx.com,2011-01-05:Comment/448293722018-03-09T13:57:00Z2018-03-09T13:57:00ZIndexing failing<div><p>It seems the primary problem is disk space. So, I&#8217;ve temporarily shifted you over to a new server, processed the indices, and they&#8217;re <em>still</em> using all of the server&#8217;s disk space&#8230; (which is around 20GB - far more than the 5GB allowed by the plan you&#8217;re on).</p>
<p>We&#8217;re going to need to find a better long-term solution!</p>
<p>There&#8217;s two parts to this. First off: it&#8217;s looking like you&#8217;re currently using even more data than the next plan level (15GB for $300/month on a dedicated machine). These dedicated machines are not generally available (hence the plan&#8217;s not listed publicly), but I can fire one up for you if you definitely want to go down that path. If you feel the data&#8217;s only going to grow further, then I can consider what the options could be for a larger server and how much that might cost.</p>
<p>To get your disk usage under control at least for the short term, I would recommend you remove the infix setting from your Tracker index. This should reduce the size of that index significantly (and it seems it&#8217;s the largest), though it will mean partial/wildcard matches will not work.</p>
<p>Something that may also help, though likely not dramatically, is upgrading Sphinx. You&#8217;re using 2.1.4, and I recommend upgrading to 2.2.11 - this is controlled in your <code>config/thinking_sphinx.yml</code> file, but changing will require running the rebuild task, and that means Sphinx will be down for the full time it takes the indices to be reprocessed (which, going by the most recent run, is at least 4 hours). Also, if the indexing fails (like it has been recently!) then the Sphinx daemon may not be able to boot. So maybe this comes after getting the indices processing correctly on 2.1.4 first before considering such a change.</p>
<p>It does also seem like there might still be some invalid dates, and I&#8217;ll look into that over the coming days.</p>
<p>If any of this doesn&#8217;t make sense, or if you have further questions, do let me know.</p>
<p>Kind regards,</p>
<p>&#8212; Pat</p></div>Pat Allantag:support.flying-sphinx.com,2011-01-05:Comment/448293722018-03-11T00:18:59Z2018-03-11T00:18:59ZIndexing failing<div><p>I've spent some time over the past day trying to figure out that integer-out-of-range issue, and not having much luck. The date values are indeed fine, and I don't see anything obvious about how the other attribute values could cause such a problem (all the integers are clearly less than what can be held within a 32-bit integer).</p>
<p>However, one thing that may help improve the processing speed for the Issue index: I realise you've got the languages association in Issue set up to go through tracker, but in the index it's better to be more explicit. The change:</p>
<pre>
<code>has languages(:id), :as =&gt; :language_ids
# becomes:
has tracker.languages.id, :as =&gt; :language_ids</code>
</pre>
<p>The reason for this is to ensure TS generates just <em>one</em> join to the trackers table (it's currently generating two). This may be fixed in later versions of TS (you're using quite an old version!), but this workaround will be more reliable.</p>
<p>Also, I've removed the symbol for :id - that was only needed for TS v2 and older (but the behaviour should remain the same, so it's not a big deal, and certainly not related to the problems at hand).</p></div>Pat Allan