Hpricot is an HTML parser: a fantastic Ruby library that is easy to install and easy to use.
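To install:
sudo gem install hpricot
open-uri does not need to be installed. It ships with Ruby's standard library and lets open() read a URL over the network as if it were a file stream.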
Here is a simple web scraping script.
It fetches the results for a group of students from the Anna University website.
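require 'hpricot'
require 'open-uri'

# url (the base address of the results page) and b (the tidied text built
# from a) are set in lines of the script that are not shown here.
exam_no = "52108621001".."52108621039"
exam_no.each do |each_number|
  doc = Hpricot(open(url + each_number))
  data = doc.search('table')
  # keep the raw result tables
  File.open("result.html", "a") { |f| f.puts(data) }
  x = doc.search('table').inner_html
  # strip the HTML tags, leaving only the text
  a = x.gsub(/<\/?[^>]*>/, "")
  puts b + "\n" + "======================="
  File.open("result.txt", "a") { |f| f.puts(b + "\n\n" + "=================") }
end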
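Two steps of the script can be tried without Hpricot or a network connection: expanding the range of register numbers, and stripping tags with gsub. The sketch below uses only core Ruby. The helper names exam_numbers and strip_tags and the sample markup are made up for illustration and are not part of the original script.

```ruby
# Hypothetical helpers for illustration; only core Ruby is used.

# Expand a range of register numbers given as strings.
# A string range steps through its members in order, so
# "52108621001".."52108621039" covers every number in between.
def exam_numbers(first, last)
  (first..last).to_a
end

# Remove anything that looks like an HTML tag, using the same
# regular expression as the script above.
def strip_tags(html)
  html.gsub(/<\/?[^>]*>/, "")
end

# Made-up sample markup standing in for one results table.
sample = "<table><tr><td>Subject</td><td>Grade</td></tr></table>"

numbers = exam_numbers("52108621001", "52108621039")
puts numbers.size        # 39
puts numbers.first       # 52108621001
puts numbers.last        # 52108621039
puts strip_tags(sample)  # SubjectGrade
```

Note that the regex simply deletes the tags, so the text of neighbouring cells runs together ("SubjectGrade"). If the cells need to stay apart, one option is to replace each tag with a space instead of an empty string.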