
Why?

I am doing a trial assignment on web scraping, and I am new to this. Using the Web-Harvest (Java) API I am able to extract static content, but some of the data is enclosed inside JavaScript functions and HTML elements. I need some guidance. Thanks in advance for helping.

I am able to extract the contents "Hai" and "Hello", but I am unable to extract the contents "yyyyyy" and "3435534", because they are inside an HTML tag. I am currently using the Web-Harvest API to extract content from the website. The API returns its result after filtering out the HTML elements, so I cannot extract the HTML attribute values.
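For illustration only, here is a minimal sketch of reading attribute values with the XPath support that ships with the JDK, rather than with Web-Harvest itself. The markup in `main` is a hypothetical guess at what the page might look like (the real page source was not posted), and the class name `AttributeValues` and its `select` method are made up for this example. It also assumes the input is well-formed XML, so a real HTML page would first need to be cleaned up into XHTML.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class AttributeValues {

    // Evaluates an XPath expression against well-formed (X)HTML and returns
    // the string value of every matched attribute or text node.
    public static List<String> select(String xhtml, String xpath) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xhtml)));
        NodeList nodes = (NodeList) XPathFactory.newInstance()
                .newXPath()
                .evaluate(xpath, doc, XPathConstants.NODESET);
        List<String> values = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            values.add(nodes.item(i).getNodeValue());
        }
        return values;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical markup: the values sit in the onclick attribute,
        // while "Hai" is ordinary element text.
        String xhtml =
            "<div><a href=\"#\" onclick=\"Viewdetails('yyyyyy','3435534')\">Hai</a></div>";

        System.out.println(select(xhtml, "//a/text()"));   // element text
        System.out.println(select(xhtml, "//a/@onclick")); // attribute value
    }
}
```

The point of the sketch is that `//a/text()` only ever returns what is between the tags, whereas `//a/@onclick` selects the attribute node, which is where the missing values would live under this assumption.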


I think there is no Ajax call. When you click the view link on the page, the Viewdetails function displays its parameter values (the data) in a small new window on the same page. The function receives its parameter values when the page is first loaded. I want to scrape those parameter values from the site.


Instead of guessing what might happen, can you just post the function? Otherwise I can only guess at an answer or blindly suggest all sorts of things that may not help at all.


If you can see the actual values in the page source and the number of details is always the same (as it seems to me), then why not just try a regex?
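A rough sketch of the regex idea in Java, using only `java.util.regex`. Since the actual function and markup were never posted, the sample HTML is an assumption: it supposes the values appear as quoted arguments of a `Viewdetails(...)` call somewhere in the raw page source. The class name `ViewdetailsRegex` and its `extract` method are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ViewdetailsRegex {

    // Matches a call such as Viewdetails('yyyyyy','3435534') and captures
    // the raw argument list between the parentheses.
    private static final Pattern CALL =
            Pattern.compile("Viewdetails\\s*\\(([^)]*)\\)");

    // Matches one single- or double-quoted argument inside that list.
    private static final Pattern ARG =
            Pattern.compile("['\"]([^'\"]*)['\"]");

    // Returns one list of argument strings per Viewdetails(...) occurrence.
    public static List<List<String>> extract(String html) {
        List<List<String>> calls = new ArrayList<>();
        Matcher call = CALL.matcher(html);
        while (call.find()) {
            List<String> args = new ArrayList<>();
            Matcher arg = ARG.matcher(call.group(1));
            while (arg.find()) {
                args.add(arg.group(1));
            }
            calls.add(args);
        }
        return calls;
    }

    public static void main(String[] args) {
        // Hypothetical page fragment; the real page source may differ.
        String html =
            "<a href=\"#\" onclick=\"Viewdetails('yyyyyy','3435534')\">Hai</a>";
        System.out.println(extract(html));
    }
}
```

This has to run against the unfiltered page source, not the text left after the HTML elements are stripped. It is deliberately simple: an argument containing a `)` or a quote character would break it, and the function's own definition (`function Viewdetails(a, b)`) would show up as a match with no quoted arguments, so empty results may need to be skipped.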

Anyone who wants to extract data from the web can use one of the web data extraction tools (free or paid) available on the internet.

Yes, this kind of tool extracts data from HTML pages (I am not sure about dynamic content). Here is one example.

If you run an online store and want to compare your product prices with those of another online store, you can use this kind of tool. You run the tool, add the URLs you want to scrape, and it returns the data in a structured form.

So these tools can be useful for a business intelligence solution.
If readers of this thread have any questions, feel free to ask.