android - How to parse HTML table using jsoup?
I am trying to parse HTML using jsoup. This is my first time working with jsoup, and it is proving a little hard for me. The HTML table I am trying to parse is below. The table is complicated because of its many tr and td elements, and I don't know how to proceed to select the name of each column in table 1: "group block" (table 0 is "topline" and I don't need it).
I need to select "bbd, bbgen, bbtest, conn, cpu, disk, files, hobbitd, http, info, memory, msgs, ports, procs, trends" and set them in a TextView defined in an XML layout file. Is this possible using jsoup?
This is how I am connecting to the URL:
String username = "user";
String password = "pass";
String login = username + ":" + password;
String base64login = new String(android.util.Base64.encode(login.getBytes(), android.util.Base64.NO_WRAP));
Document document = Jsoup.connect("http://example.com")
        .header("Authorization", "Basic " + base64login)
        .get();

HTML code:
<table summary="topline" width="100%">
<tr><td height=16> </td></tr>
<!-- menu bar -->
<tr>
<td valign=middle align=left width="30%"> <font face="arial, helvetica" size="+1" color="silver"><b>xymon</b></font> </td>
<td valign=middle align=center width="40%"> <center><font face="arial, helvetica" size="+1" color="silver"><b>current status</b></font></center> </td>
<td valign=middle align=right width="30%"> <font face="arial, helvetica" size="+1" color="silver"><b>thu jul 23 16:05:06 2015</b></font> </td>
</tr>
<tr> <td colspan=3> <hr width="100%"> </td> </tr>
</table>
<br>
<a name=hosts-blk> </a>

<center><table summary="group block" border=0 cellpadding=2>
<tr><td valign=middle rowspan=2><center><font color="#fffff0" size="+1"> </font></center></td>
<td align=center valign=bottom width=45> <a href="/hobbit-cgi/hobbitcolumn.sh?bbd"><font color="#87a9e5" size="-1"><b>bbd</b></font></a> </td>
<td align=center valign=bottom width=45> <a href="/hobbit-cgi/hobbitcolumn.sh?bbgen"><font color="#87a9e5" size="-1"><b>bbgen</b></font></a> </td>
<td align=center valign=bottom width=45> <a href="/hobbit-cgi/hobbitcolumn.sh?bbtest"><font color="#87a9e5" size="-1"><b>bbtest</b></font></a> </td>
<td align=center valign=bottom width=45> <a href="/hobbit-cgi/hobbitcolumn.sh?conn"><font color="#87a9e5" size="-1"><b>conn</b></font></a> </td>
<td align=center valign=bottom width=45> <a href="/hobbit-cgi/hobbitcolumn.sh?cpu"><font color="#87a9e5" size="-1"><b>cpu</b></font></a> </td>
<td align=center valign=bottom width=45> <a href="/hobbit-cgi/hobbitcolumn.sh?disk"><font color="#87a9e5" size="-1"><b>disk</b></font></a> </td>
<td align=center valign=bottom width=45> <a href="/hobbit-cgi/hobbitcolumn.sh?files"><font color="#87a9e5" size="-1"><b>files</b></font></a> </td>
<td align=center valign=bottom width=45> <a href="/hobbit-cgi/hobbitcolumn.sh?hobbitd"><font color="#87a9e5" size="-1"><b>hobbitd</b></font></a> </td>
<td align=center valign=bottom width=45> <a href="/hobbit-cgi/hobbitcolumn.sh?http"><font color="#87a9e5" size="-1"><b>http</b></font></a> </td>
<td align=center valign=bottom width=45> <a href="/hobbit-cgi/hobbitcolumn.sh?info"><font color="#87a9e5" size="-1"><b>info</b></font></a> </td>
<td align=center valign=bottom width=45> <a href="/hobbit-cgi/hobbitcolumn.sh?memory"><font color="#87a9e5" size="-1"><b>memory</b></font></a> </td>
<td align=center valign=bottom width=45> <a href="/hobbit-cgi/hobbitcolumn.sh?msgs"><font color="#87a9e5" size="-1"><b>msgs</b></font></a> </td>
<td align=center valign=bottom width=45> <a href="/hobbit-cgi/hobbitcolumn.sh?ports"><font color="#87a9e5" size="-1"><b>ports</b></font></a> </td>
<td align=center valign=bottom width=45> <a href="/hobbit-cgi/hobbitcolumn.sh?procs"><font color="#87a9e5" size="-1"><b>procs</b></font></a> </td>
<td align=center valign=bottom width=45> <a href="/hobbit-cgi/hobbitcolumn.sh?trends"><font color="#87a9e5" size="-1"><b>trends</b></font></a> </td>
</tr>
<tr><td colspan=15><hr width="100%"></td></tr>

Edit:
I tried this, but it doesn't work:
ArrayList<String> groupblock = new ArrayList<String>();
Object[] objplace;
Element table = document.select("table").get(1); // select the second table: "group block"
Elements rows = table.select("tr");
for (int i = 0; i < rows.size(); i++) {
    Element row = rows.get(i);
    Elements col = row.select("td");
    if (col.get(1).text().equals("bbd")) { // check only 1 field for the moment
        groupblock.add(col.get(1).text());
    }
}
objplace = groupblock.toArray();

Then I do:
TextView txtgroupblock = (TextView) findViewById(R.id.txtgroupblock);
txtgroupblock.setText("");
for (int i = 0; i < objplace.length; i++) {
    txtgroupblock.append(objplace[i].toString() + " ");
}

The error:
07-23 21:26:36.454: E/AndroidRuntime(330): FATAL EXCEPTION: AsyncTask #1
07-23 21:26:36.454: E/AndroidRuntime(330): java.lang.RuntimeException: An error occured while executing doInBackground()
07-23 21:26:36.454: E/AndroidRuntime(330):  at android.os.AsyncTask$3.done(AsyncTask.java:200)
07-23 21:26:36.454: E/AndroidRuntime(330):  at java.util.concurrent.FutureTask$Sync.innerSetException(FutureTask.java:274)
07-23 21:26:36.454: E/AndroidRuntime(330):  at java.util.concurrent.FutureTask.setException(FutureTask.java:125)
07-23 21:26:36.454: E/AndroidRuntime(330):  at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:308)
07-23 21:26:36.454: E/AndroidRuntime(330):  at java.util.concurrent.FutureTask.run(FutureTask.java:138)
07-23 21:26:36.454: E/AndroidRuntime(330):  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1088)
07-23 21:26:36.454: E/AndroidRuntime(330):  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:581)
07-23 21:26:36.454: E/AndroidRuntime(330):  at java.lang.Thread.run(Thread.java:1019)
07-23 21:26:36.454: E/AndroidRuntime(330): Caused by: java.lang.IndexOutOfBoundsException: Invalid index 1, size is 1
07-23 21:26:36.454: E/AndroidRuntime(330):  at java.util.ArrayList.throwIndexOutOfBoundsException(ArrayList.java:257)
07-23 21:26:36.454: E/AndroidRuntime(330):  at java.util.ArrayList.get(ArrayList.java:311)
07-23 21:26:36.454: E/AndroidRuntime(330):  at org.jsoup.select.Elements.get(Elements.java:544)
07-23 21:26:36.454: E/AndroidRuntime(330):  at activities.monitorapp.MainActivity$Update.doInBackground(MainActivity.java:211)
07-23 21:26:36.454: E/AndroidRuntime(330):  at activities.monitorapp.MainActivity$Update.doInBackground(MainActivity.java:1)
07-23 21:26:36.454: E/AndroidRuntime(330):  at android.os.AsyncTask$2.call(AsyncTask.java:185)
07-23 21:26:36.454: E/AndroidRuntime(330):  at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:306)

Edit 2:
Now I have a parallel problem. Right after the HTML shown before, I have the following HTML code (it simply follows the previous HTML code, in the same HTML file):
...
<td align=center valign=bottom width=45> <a href="/hobbit-cgi/hobbitcolumn.sh?procs"><font color="#87a9e5" size="-1"><b>procs</b></font></a> </td>
<td align=center valign=bottom width=45> <a href="/hobbit-cgi/hobbitcolumn.sh?trends"><font color="#87a9e5" size="-1"><b>trends</b></font></a> </td>
</tr>
<tr><td colspan=15><hr width="100%"></td></tr>

<tr class=line>
<td nowrap><a name="hostname1"> </a> <font size="+1" color="#ffffcc" face="tahoma, arial, helvetica"><span title="127.0.0.1">hostname1</span></font>
<td align=center><a href="/hobbit-cgi/bb-hostsvc.sh?host=hostname1.&service=bbd"><img src="/hobbit/gifs/static/green.gif" alt="bbd:green:268d04h25m" title="bbd:green:268d04h25m" height="16" width="16" border=0></a></td>
<td align=center><a href="/hobbit-cgi/bb-hostsvc.sh?host=hostname1&service=bbgen"><img src="/hobbit/gifs/static/green.gif" alt="bbgen:green:268d04h24m" title="bbgen:green:268d04h24m" height="16" width="16" border=0></a></td>
<td align=center><a href="/hobbit-cgi/bb-hostsvc.sh?host=hostname1&service=bbtest"><img src="/hobbit/gifs/static/green.gif" alt="bbtest:green:268d04h25m" title="bbtest:green:268d04h25m" height="16" width="16" border=0></a></td>
<td align=center><a href="/hobbit-cgi/bb-hostsvc.sh?host=hostname1&service=conn"><img src="/hobbit/gifs/static/green.gif" alt="conn:green:268d04h25m" title="conn:green:268d04h25m" height="16" width="16" border=0></a></td>
<td align=center><a href="/hobbit-cgi/bb-hostsvc.sh?host=hostname1&service=cpu"><img src="/hobbit/gifs/static/green.gif" alt="cpu:green:169d00h15m" title="cpu:green:169d00h15m" height="16" width="16" border=0></a></td>
<td align=center><a href="/hobbit-cgi/bb-hostsvc.sh?host=hostname1&service=disk"><img src="/hobbit/gifs/static/green.gif" alt="disk:green:268d04h25m" title="disk:green:268d04h25m" height="16" width="16" border=0></a></td>
<td align=center><a href="/hobbit-cgi/bb-hostsvc.sh?host=hostname1&service=files"><img src="/hobbit/gifs/static/clear.gif" alt="files:clear:268d04h25m" title="files:clear:268d04h25m" height="16" width="16" border=0></a></td>
<td align=center><a href="/hobbit-cgi/bb-hostsvc.sh?host=hostname1&service=hobbitd"><img src="/hobbit/gifs/static/green.gif" alt="hobbitd:green:169d01h05m" title="hobbitd:green:169d01h05m" height="16" width="16" border=0></a></td>
<td align=center><a href="/hobbit-cgi/bb-hostsvc.sh?host=hostname1&service=http"><img src="/hobbit/gifs/static/green.gif" alt="http:green:268d04h19m" title="http:green:268d04h19m" height="16" width="16" border=0></a></td>
<td align=center><a href="/hobbit-cgi/bb-hostsvc.sh?host=hostname1&service=info"><img src="/hobbit/gifs/static/green.gif" alt="info:green:127.0.0.1" title="info:green:127.0.0.1" height="16" width="16" border=0></a></td>
<td align=center><a href="/hobbit-cgi/bb-hostsvc.sh?host=hostname1&service=memory"><img src="/hobbit/gifs/static/green.gif" alt="memory:green:268d04h25m" title="memory:green:268d04h25m" height="16" width="16" border=0></a></td>
<td align=center><a href="/hobbit-cgi/bb-hostsvc.sh?host=hostname1&service=msgs"><img src="/hobbit/gifs/static/green.gif" alt="msgs:green:268d04h20m" title="msgs:green:268d04h20m" height="16" width="16" border=0></a></td>
<td align=center><a href="/hobbit-cgi/bb-hostsvc.sh?host=hostname1&service=ports"><img src="/hobbit/gifs/static/clear.gif" alt="ports:clear:268d04h25m" title="ports:clear:268d04h25m" height="16" width="16" border=0></a></td>
<td align=center><a href="/hobbit-cgi/bb-hostsvc.sh?host=hostname1&service=procs"><img src="/hobbit/gifs/static/clear.gif" alt="procs:clear:268d04h25m" title="procs:clear:268d04h25m" height="16" width="16" border=0></a></td>
<td align=center><a href="/hobbit-cgi/bb-hostsvc.sh?host=hostname1&service=trends"><img src="/hobbit/gifs/static/green.gif" alt="trends:green:" title="trends:green:" height="16" width="16" border=0></a></td>
</tr>

<tr class=line>
<td nowrap><a name="hostname2"> </a> <font size="+1" color="#ffffcc" face="tahoma, arial, helvetica"><span title="127.0.0.2">hostname2</span></font>
<td align=center><a href="/hobbit-cgi/bb-hostsvc.sh?host=hostname2&service=bbd"><img src="/hobbit/gifs/static/red.gif" alt="bbd:red:16d06h46m" title="bbd:red:16d06h46m" height="16" width="16" border=0></a></td>
<td align=center>-</td>
<td align=center>-</td>
<td align=center><a href="/hobbit-cgi/bb-hostsvc.sh?host=hostname2&service=conn"><img src="/hobbit/gifs/static/green.gif" alt="conn:green:16d06h46m" title="conn:green:16d06h46m" height="16" width="16" border=0></a></td>
<td align=center>-</td>
<td align=center>-</td>
<td align=center>-</td>
<td align=center>-</td>
<td align=center><a href="/hobbit-cgi/bb-hostsvc.sh?host=hostname2&service=http"><img src="/hobbit/gifs/static/green.gif" alt="http:green:16d06h46m" title="http:green:16d06h46m" height="16" width="16" border=0></a></td>
<td align=center><a href="/hobbit-cgi/bb-hostsvc.sh?host=hostname2&service=info"><img src="/hobbit/gifs/static/green.gif" alt="info:green:127.0.0.2" title="info:green:127.0.0.2" height="16" width="16" border=0></a></td>
<td align=center>-</td>
<td align=center>-</td>
<td align=center>-</td>
<td align=center>-</td>
<td align=center><a href="/hobbit-cgi/bb-hostsvc.sh?host=hostname2&service=trends"><img src="/hobbit/gifs/static/green.gif" alt="trends:green:" title="trends:green:" height="16" width="16" border=0></a></td>
</tr>

</table></center><br> <br><br>

In this case I have to parse the 2 hostnames (hostname1 and hostname2) and put them in separate TextViews, but the problem is that a hostname can change its name in the future. In addition, I have to parse the "img src" in each td, for example:
<td align=center><a href="/hobbit-cgi/bb-hostsvc.sh?host=hostname1&service=http"><img src="/hobbit/gifs/static/green.gif" alt="http:green:268d04h19m" title="http:green:268d04h19m" height="16" width="16" border=0></a></td>

I need to parse /hobbit/gifs/static/green.gif, and then I have to prepend the rest of the URL at the beginning to get the image: http://example.com/hobbit/gifs/static/green.gif.
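One possible way to build that absolute URL is sketched below with the standard java.net.URI class. The class name ImageUrlSketch and the helper toAbsolute are hypothetical names used only for illustration, and the base URL is the example.com placeholder from the question. When the document was fetched with Jsoup.connect(...).get(), jsoup's absUrl("src") on the img element should give the same result without any manual string handling.

```java
import java.net.URI;

class ImageUrlSketch {

    // Resolve a src attribute against the page URL.
    // A relative src is joined to the base; an absolute src is returned unchanged.
    static String toAbsolute(String pageUrl, String src) {
        return URI.create(pageUrl).resolve(src).toString();
    }

    public static void main(String[] args) {
        // prints http://example.com/hobbit/gifs/static/green.gif
        System.out.println(toAbsolute("http://example.com/", "/hobbit/gifs/static/green.gif"));
    }
}
```

Resolving instead of concatenating strings avoids doubled or missing slashes if the page URL later gains a path.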
I know that once I have the image URL, I can do something like this:

InputStream input = new java.net.URL(imgsrc).openStream();
bitmap = BitmapFactory.decodeStream(input);
ImageView logoimg = (ImageView) findViewById(R.id.logo);
logoimg.setImageBitmap(bitmap);

But I am missing the previous steps... Any ideas? I don't know how to start...
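The problem is here:

if (col.get(1).text().equals("bbd")) {
    groupblock.add(col.get(i).text());
}

You try to access col.get(i), which may be out of bounds, as the error also tells you. A row such as the separator row (a single td with colspan=15) has only one cell, so any index above 0 fails there.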
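To make the failure mode concrete, here is a minimal sketch that uses plain Java lists as a stand-in for jsoup rows and cells. The class name CellIndexGuardSketch, the helper cellsAt, and the sample data are all made up for illustration; they only mirror the header row and the separator row from the question. The idea is to check the row size before indexing, so short rows are skipped instead of throwing:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CellIndexGuardSketch {

    // Collect the cell at `index` from every row that has at least index + 1 cells.
    static List<String> cellsAt(List<List<String>> rows, int index) {
        List<String> out = new ArrayList<>();
        for (List<String> cells : rows) {
            if (index < cells.size()) { // guard: skip short rows instead of throwing
                out.add(cells.get(index));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<String>> rows = Arrays.asList(
            Arrays.asList("", "bbd", "bbgen"), // header row: several cells
            Arrays.asList(""));                // separator row: a single cell (colspan=15)
        System.out.println(cellsAt(rows, 1));  // prints [bbd]
    }
}
```

With jsoup the same guard would be a size() check on the Elements returned by row.select("td") before calling get(1).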
If you change the index to the one you want, you should be fine. Maybe something like this:

ArrayList<String> groupblock = new ArrayList<String>();
Object[] objplace;
Element table = document.select("table").get(1); // select the second table: "group block"
Elements rows = table.select("tr");
for (int i = 0; i < rows.size(); i++) {
    Element row = rows.get(i);
    Elements cols = row.select("td");
    // iterate over the cells that exist instead of indexing a fixed position
    for (Element col : cols) {
        switch (col.text()) {
        case "bbd":
        case "bbgen":
        case "bbtest":
        // ...more cases if you need them
            groupblock.add(col.select("a").first().attr("href"));
            System.out.println(col.text());
            break;
        default:
            break;
        }
    }
}
objplace = groupblock.toArray();

I am not sure what you need from the DOM, but I think you get the idea.
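For the column names and the second edit (host names and image URLs), a selector-based variant might look like the sketch below. It is only a sketch under assumptions: the class GroupBlockSketch, its helper methods, and the trimmed SAMPLE_HTML are hypothetical stand-ins for the real page, and it needs the jsoup library on the classpath. It relies on the summary="group block" attribute, the class=line rows, and the span[title] host cell that appear in the HTML from the question.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

class GroupBlockSketch {

    // Trimmed-down, made-up stand-in for the page in the question.
    static final String SAMPLE_HTML =
        "<table summary=\"topline\"><tr><td>xymon</td></tr></table>"
      + "<table summary=\"group block\">"
      + "<tr><td rowspan=2></td>"
      + "<td><a href=\"/hobbit-cgi/hobbitcolumn.sh?bbd\"><b>bbd</b></a></td>"
      + "<td><a href=\"/hobbit-cgi/hobbitcolumn.sh?conn\"><b>conn</b></a></td></tr>"
      + "<tr><td colspan=15><hr></td></tr>"
      + "<tr class=line><td><span title=\"127.0.0.1\">hostname1</span></td>"
      + "<td><a href=\"#\"><img src=\"/hobbit/gifs/static/green.gif\"></a></td>"
      + "<td><a href=\"#\"><img src=\"/hobbit/gifs/static/red.gif\"></a></td></tr>"
      + "<tr class=line><td><span title=\"127.0.0.2\">hostname2</span></td>"
      + "<td>-</td>"
      + "<td><a href=\"#\"><img src=\"/hobbit/gifs/static/green.gif\"></a></td></tr>"
      + "</table>";

    // Pick the table by its summary attribute rather than by position.
    static Element groupBlock(Document doc) {
        return doc.select("table[summary=\"group block\"]").first();
    }

    // Column names are the link texts in the first row of the table.
    static List<String> columnNames(Document doc) {
        List<String> names = new ArrayList<>();
        Element headerRow = groupBlock(doc).select("tr").first();
        for (Element link : headerRow.select("td a")) {
            names.add(link.text());
        }
        return names;
    }

    // Host name -> absolute image URLs found in that host's row.
    // Cells that only contain "-" have no img and contribute nothing.
    static Map<String, List<String>> hostImages(Document doc) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (Element row : groupBlock(doc).select("tr.line")) {
            String host = row.select("span[title]").first().text();
            List<String> urls = new ArrayList<>();
            for (Element img : row.select("td img")) {
                urls.add(img.absUrl("src")); // resolved against the document's base URI
            }
            result.put(host, urls);
        }
        return result;
    }

    public static void main(String[] args) {
        // The second argument is the base URI used by absUrl.
        Document doc = Jsoup.parse(SAMPLE_HTML, "http://example.com/");
        System.out.println(columnNames(doc));
        System.out.println(hostImages(doc));
    }
}
```

Because the host name is read from the row itself, nothing has to change when a host is renamed. On Android the fetching and parsing would still have to stay off the main thread, as in the AsyncTask from the question.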