How to check whether a hyperlink's text and URL exist on a web page using C#


I want to create a tool that checks whether a link and its text exist on a web page, using C#. I also want to know whether the link is dofollow or nofollow. For example, here is a link to the Wikipedia site: https://en.wikipedia.org/wiki/main_page

You can see a lot of link text in the article body: Arts, History, etc. I want to check whether the "Arts" link text exists, whether it points to the correct hyperlink (https://en.wikipedia.org/wiki/portal:arts), and whether it is nofollow or dofollow. I am going to create this tool, so please help me.

The main idea of the project is to monitor a link and its text in an online article: whether they still exist or have been deleted by someone, and whether the link is dofollow or nofollow.

I'm not sure what you mean by nofollow or dofollow, but here is a quick code sample that iterates through the links on a web page. It should be a starting point to get you going. If you're looking for something a little more robust, you can look into the HTML Agility Pack. It's a bit more complicated to use, but it gives you a DOM view of the web page.

Since this uses the WebBrowser control, be sure to include a reference to System.Windows.Forms. If you must use a project type that does not allow a Forms object, it can be done with WebClient or WebRequest, both of which take a bit more effort to get working.

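    using System;
    using System.Windows.Forms;

    // Place these methods inside a class. WebBrowser needs an STA thread with a
    // running message loop (for example a Windows Forms application); without
    // one, DocumentCompleted never fires.
    static void CheckLink()
    {
        WebBrowser wb = new WebBrowser();
        wb.DocumentCompleted += wb_DocumentCompleted;
        // Pass a Firefox user agent to prevent getting the mobile site
        wb.Navigate("https://en.wikipedia.org/wiki/main_page", null, null,
            "User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:39.0) Gecko/20100101 Firefox/39.0");
    }

    static void wb_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
    {
        WebBrowser wb = (WebBrowser)sender;
        string html = wb.DocumentText;

        // Find the body tag; fall back to the start of the document if it is missing
        int bodyPosition = Math.Max(0, html.IndexOf("<body", StringComparison.OrdinalIgnoreCase));

        int position = html.IndexOf("<a href", bodyPosition, StringComparison.OrdinalIgnoreCase);
        while (position > -1)
        {
            int beginUrlPosition = html.IndexOf("\"", position);
            if (beginUrlPosition < 0) break; // no quoted URL left
            int endUrlPosition = html.IndexOf("\"", beginUrlPosition + 1);
            if (endUrlPosition < 0) break; // unterminated attribute value
            string link = html.Substring(beginUrlPosition + 1, endUrlPosition - beginUrlPosition - 1);

            // Do something with the found link

            // Look for the next link
            position = html.IndexOf("<a href", endUrlPosition, StringComparison.OrdinalIgnoreCase);
        }
    }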
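On the nofollow part of the question: a link is nofollow when its rel attribute contains the token "nofollow". "Dofollow" is not an HTML attribute, just the informal name for a link without that token. The sketch below is one possible way to check link text, target URL, and nofollow status in a single pass over an HTML string. It swaps the IndexOf scanning above for regular expressions, which is still fragile on malformed markup (the HTML Agility Pack would be the sturdier choice). The names LinkChecker, LinkStatus, Check, and Resolve are hypothetical, and ignoring case when comparing text and URLs is an assumption made to match the lowercase URLs in the question.

```csharp
using System;
using System.Text.RegularExpressions;

public enum LinkStatus { Missing, DoFollow, NoFollow }

public static class LinkChecker
{
    // Matches <a ...>...</a> and captures the attribute list and the inner HTML.
    private static readonly Regex AnchorPattern = new Regex(
        @"<a\s+(?<attrs>[^>]*)>(?<inner>.*?)</a>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Reads one quoted attribute value from an attribute list; null if absent.
    private static string GetAttribute(string attrs, string name)
    {
        Match m = Regex.Match(attrs,
            @"(?:^|\s)" + Regex.Escape(name) + @"\s*=\s*([""'])(?<value>.*?)\1",
            RegexOptions.IgnoreCase | RegexOptions.Singleline);
        return m.Success ? m.Groups["value"].Value : null;
    }

    // Turns a root-relative or protocol-relative href into an absolute URL.
    private static string Resolve(Uri pageUri, string href)
    {
        if (href.StartsWith("//")) return pageUri.Scheme + ":" + href;
        if (href.StartsWith("/")) return pageUri.GetLeftPart(UriPartial.Authority) + href;
        Uri resolved;
        return Uri.TryCreate(pageUri, href, out resolved) ? resolved.AbsoluteUri : href;
    }

    // Looks for an anchor whose visible text and target both match.
    public static LinkStatus Check(string html, string pageUrl, string linkText, string targetUrl)
    {
        Uri pageUri = new Uri(pageUrl);
        foreach (Match m in AnchorPattern.Matches(html))
        {
            string attrs = m.Groups["attrs"].Value;
            string href = GetAttribute(attrs, "href");
            if (href == null) continue;

            // Strip nested tags such as <b>Arts</b> to get the visible text.
            string text = Regex.Replace(m.Groups["inner"].Value, "<[^>]+>", "").Trim();
            if (!text.Equals(linkText, StringComparison.OrdinalIgnoreCase)) continue;
            if (!Resolve(pageUri, href).Equals(targetUrl, StringComparison.OrdinalIgnoreCase)) continue;

            // rel is a whitespace-separated token list, e.g. "external nofollow".
            string rel = GetAttribute(attrs, "rel") ?? "";
            bool noFollow = Array.Exists(
                rel.Split(new[] { ' ', '\t', '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries),
                token => token.Equals("nofollow", StringComparison.OrdinalIgnoreCase));
            return noFollow ? LinkStatus.NoFollow : LinkStatus.DoFollow;
        }
        return LinkStatus.Missing;
    }
}
```

For the example in the question, the call would look like LinkChecker.Check(html, "https://en.wikipedia.org/wiki/main_page", "Arts", "https://en.wikipedia.org/wiki/portal:arts"). The html string can come from wb.DocumentText in the handler above or, in a project without Forms, from new System.Net.WebClient().DownloadString(pageUrl). A monitoring tool could run this on a schedule and report when the result changes to Missing or NoFollow.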
