Announcement (2017-05-07): www.ruby-forum.com is now read-only since I
unfortunately do not have the time to support and maintain the forum any
more. Please see rubyonrails.org/community and ruby-lang.org/en/community
for other Rails- and Ruby-related community platforms.

The "integer", "string" and "float" methods are just shorthands for
the column call using that type.
It's also the "new way" (new since Rails 2, not that new now) of
writing migrations. And the "timestamps" will create both a created_at
and also an updated_at column.
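For example, the two styles are equivalent inside a Rails 2-style migration (the table and column names below are made up for illustration):

```ruby
class CreateTeams < ActiveRecord::Migration
  def self.up
    create_table :teams do |t|
      # Shorthand style:
      t.string  :name
      t.integer :wins
      t.float   :avg_yards
      t.timestamps            # adds both created_at and updated_at

      # Equivalent long form:
      # t.column :name, :string
      # t.column :wins, :integer
    end
  end

  def self.down
    drop_table :teams
  end
end
```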
-
Maurício Linhares
http://alinhavado.wordpress.com/ (pt-br) |
http://codeshooter.wordpress.com/ (en)

Maurício Linhares wrote:
> The "integer", "string" and "float" methods are just shorthands for
> the column call using that type.
>
> It's also the "new way" (new since Rails 2, not that new now) of
> writing migrations. And the "timestamps" will create both a created_at
> and also an updated_at column.
>
> -
> Maurício Linhares
> http://alinhavado.wordpress.com/ (pt-br) |
> http://codeshooter.wordpress.com/ (en)
Thanks Mauricio - I actually like it better than having to do things a
long way. Anything shorter is better, IMO. I appreciate the
explanation.
I have one more question..
I've created a ruby program that actually parses raw statistics from the
main NCAA web site, which I want to bring into my own database.
So, using the example above.. here's an example of the parser I created:
#== Scraper Version 1.0
#
#*Created By:* _Elricstorm_
#
# _Special thanks to Soledad Penades for his initial parse idea which I
# worked with to create the Scraper program._
# His article is located at
# http://www.iterasi.net/openviewer.aspx?sqrlitid=wd...
#
require 'hpricot'
require 'open-uri'

# This class is used to parse and collect data out of an html element
class Scraper
  attr_accessor :url, :element_type, :clsname, :childsearch, :doc,
                :numrows

  # Define what the url is, what element type and class name we want to
  # parse, and open the url.
  def initialize(url, element_type, clsname, childsearch)
    @url          = url
    @element_type = element_type
    @clsname      = clsname
    @childsearch  = childsearch
    @doc          = Hpricot(open(url))
    @numrows      = 0 # set properly once scrape_data has run
  end
  # Scrape data based on the type of element, its class name, and the
  # child element that contains our data
  def scrape_data
    @rows = []
    (doc/"#{@element_type}.#{@clsname}#{@childsearch}").each do |row|
      cells = []
      (row/"td").each do |cell|
        if (cell/"span.s").length > 0
          values = (cell/"span.s").inner_html.split('<br />').collect { |str|
            pair = str.strip.split('=').collect { |val| val.strip }
            Hash[pair[0], pair[1]]
          }
          if values.length == 1
            cells << cell.inner_text.strip
          else
            cells << values
          end
        else
          cells << cell.inner_text.strip
        end
      end
      @rows << cells
    end
    @rows.shift      # Remove the row containing the <th> table header elements.
    @rows.delete([]) # Remove any empty rows in our array of arrays.
    @numrows = @rows.length
  end
  # Replace the first cell of the last row with a known value
  def clean_celldata
    @rows[@numrows - 1][0] = 120
  end
  # Print a joined list by row to see our results
  def print_values
    puts "Number of rows = #{numrows}."
    @rows.each { |row| puts row.join(', ') }
  end

  # This method will be used to further process collected data
  def process_values
    File.open("testdata.txt", "w") do |f|
      @rows.each { |row| f.puts row.join(', ') }
    end
    puts "Processing completed."
  end
end
# In our search we supply the website url to parse, the type of element
# (ex: table), the class name of that element, and the child element
# that contains the data we wish to retrieve.
offensive_rushing =
  Scraper.new('http://web1.ncaa.org/mfb/natlRank.jsp?year=2008&rp...',
              'table', 'statstable', '//tr')
offensive_rushing.scrape_data
offensive_rushing.clean_celldata
offensive_rushing.print_values
offensive_rushing.process_values
-------------
So, the other question I have is: how do I tie the mechanics of a
regular Ruby program into Rails? For instance, the Ruby program I wrote
requires hpricot.
I just need a bit of guidance (I catch on fast).
Anyone can run the program I included to see how it outputs.
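As an aside, the key=value splitting that scrape_data does on the span
contents can be tried on its own in plain Ruby. The sample string below
is made up to mimic what the NCAA page's spans look like:

```ruby
# Hypothetical sample of the inner HTML a stats cell might contain
sample = "RUSHING YDS = 250<br />G = 12"

# Split on the <br /> separators, then split each piece on '=' and
# build a one-pair hash from the trimmed key and value.
values = sample.split('<br />').collect { |str|
  pair = str.strip.split('=').collect { |val| val.strip }
  Hash[pair[0], pair[1]]
}

puts values.inspect # [{"RUSHING YDS"=>"250"}, {"G"=>"12"}]
```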

If crawling through the NCAA website is something that you want to
automate, like "crawl daily at 12 pm", you can create a rake task in
your Rails application that calls this code and then set the rake task
up to run as a cron job:

# ncaa_crawler.rake at lib/tasks
task :ncaa_crawler => :environment do
  # here goes the code that gets the NCAA data
  # and saves it into the database
end
The :environment prerequisite tells Rake that it should load the Rails
application before executing the task, so all objects defined in your
Rails application will be available, like your ActiveRecord models.
The code you have shown should be placed in a class in your
application and called from this rake task.
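Putting the two together, the task body might look something like this.
TeamStat is a hypothetical ActiveRecord model, the column names are
made up, and it assumes Scraper exposes its parsed rows (e.g. via
attr_reader :rows) -- adjust all of that to your own schema:

```ruby
# lib/tasks/ncaa_crawler.rake
task :ncaa_crawler => :environment do
  scraper = Scraper.new('http://web1.ncaa.org/mfb/natlRank.jsp?year=2008&rp...',
                        'table', 'statstable', '//tr')
  scraper.scrape_data
  scraper.clean_celldata

  # TeamStat is a hypothetical model; assumes attr_reader :rows on Scraper.
  scraper.rows.each do |row|
    TeamStat.create(:rank => row[0], :name => row[1])
  end
end
```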
-
Maurício Linhares
http://alinhavado.wordpress.com/ (pt-br) |
http://codeshooter.wordpress.com/ (en)

Thanks again Mauricio, I will look into trying this out. Can the rake
task be scheduled to run only on a given weekday? For instance, if I
want the job to pull data only on Mondays, is that feasible?

That's not really related to the rake task; it's a cron config (I'm
guessing that you're on Linux or some other OS that has the cron
utility).
You can read up on cron and configure it the way you want. Then cron
will call your rake task as needed.
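For example, a crontab entry that runs the task only on Mondays at noon
might look like this (the application path is illustrative):

```
# min hour dom mon dow  command
  0   12   *   *   1    cd /path/to/your/app && rake ncaa_crawler RAILS_ENV=production
```

The fifth field (day of week) is 1 for Monday, so the command runs once
a week at 12:00.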
-
Maurício Linhares
http://alinhavado.wordpress.com/ (pt-br) |
http://codeshooter.wordpress.com/ (en)