Extract records from HTML of another site

From: SteveJ (dafella_at_swbell.net)
Date: 10/31/04

  • Next message: Jurgy: "Mail() does not function on FreeBSD and PHP and Qmail or Postfix"
    Date: Sun, 31 Oct 2004 12:41:55 GMT
    
    

    All,
    Can someone help me with the next step?

    First of all, let me say I'm new to PHP. I pieced the following code together
    from samples I found on the net and a book I bought called PHP Cookbook. So
    please forgive me if this isn't the best approach; I'm open to suggestions.
    I finally got my code to work: it logs into another site and pulls the
    orderstatus page to my server.

    <?php
    /*
    Login to site
    */
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_COOKIEJAR, "/tmp/cookieFileName");
    curl_setopt($ch,
    CURLOPT_URL,"https://www.homier.com/default.asp?page=signin");
    curl_setopt($ch, CURLOPT_POST, 1);
    curl_setopt($ch, CURLOPT_POSTFIELDS,
    "EMail=homierorders@swbell.net&Password=1040ez");
    ob_start(); // prevent any output
    curl_exec ($ch); // execute the curl command
    ob_end_clean(); // stop preventing output
    curl_close ($ch);
    unset($ch);

    /*
    Dump html of orderstatus page into a file on my server
    */
    $fh = fopen('raw_orderstatus.html','w') or die($php_errormsg);
    $ch = curl_init();
    // CURLOPT_RETURNTRANSFER is not needed here: CURLOPT_FILE already
    // sends the response body to the open file handle.
    curl_setopt($ch, CURLOPT_COOKIEFILE, "/tmp/cookieFileName");
    curl_setopt($ch, CURLOPT_URL, "https://www.homier.com/default.asp?page=orderstatus");
    curl_setopt($ch, CURLOPT_FILE, $fh);
    curl_exec ($ch);
    curl_close ($ch);
    fclose($fh); // flush and close the saved page
    ?>

    My problem: how can I capture only the data in the
    "<td class='n8n_CCCCCC_default'>" tags?
    Is there a way to do this at file creation?
    I checked with my ISP and I can't use "lynx -dump file.html".

    The goal here is to load these records into a MySQL database.

    Thanks in advance
    Steve

    The HTML code looks like this:

    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
    <html>
    <head>
    <title>homier.com</title>
    <LINK REL='stylesheet' TYPE='text/css' HREF='hdcstyle.css'>
    <meta http-equiv='Content-Type' content='text/html; charset=iso-8859-1'>
    <meta name='description' content=' '>
    <meta name='keywords' content=' '>
    <meta name='revisit-after' content='7 days'>
    <meta name='robots' content='all, index, follow'>
    <meta http-equiv="Pragma" content="no-cache">
    <script language="JavaScript"
    src="https://www.thawte.com/html/certdetails.js"
    type="text/javascript"></script>

    </head>
    <body leftmargin='0' topmargin='0' rightmargin='0' bottommargin='0'
    marginwidth='0' marginheight='0'>
    <table cellspacing="0" cellpadding="0" border="0" width='770'>
    <tr>
     <td align='left' valign='top'>
     <img src='https://www.homier.com/graphics/hdclogo3.jpg' border='0'
    width='370' height='58' alt='Homier Distributing Company, Inc.'></td>

     <td align='right' valign='top'>
      <table cellspacing="0" cellpadding="0" border="0">
      <tr><td align='right' valign='middle' class='menu'>
      <a href='https://www.homier.com/default.asp?page=cart' class='menu'><img
    src='https://www.homier.com/graphics/cart.gif' border='0' width='17'
    height='21' align='absmiddle'>Shopping Cart</a>
    | <a href='https://www.homier.com/default.asp?page=stores' class='menu'>Sale
    Locations</a>
    | <a href='https://www.homier.com/default.asp?page=about' class='menu'>About
    Us</a>
    | <a href='https://www.homier.com/default.asp?page=contacts'
    class='menu'>Contact Us</a>
    | <a href='https://www.homier.com/default.asp?page=faq' class='menu'>FAQ</a>
    </td></tr>
    <tr><td align='right' valign='middle' class='menu'>
    <a href='https://www.homier.com/default.asp?page=myprofile' class='menu'>My
    Account</a>
    | <a href='https://www.homier.com/default.asp?page=orderstatus'
    class='menu'>Order Status</a> |
    <a href='https://www.homier.com/default.asp?page=dealers'
    class='menu'>Dealer Extranet</a>
    </td></tr>

      </table>
     </td>
    </tr>
    </table>
    <table cellspacing="0" cellpadding="0" border="0" width='770'>
    <tr>
     <td colspan='2' valign='top' align='center'
    background='https://www.homier.com/graphics/hdcbk3.jpg'>
    <table cellspacing='0' cellpadding='0' border='0'>
    <tr><table cellspacing='0' cellpadding='0' border='0'><tr>
    <td align='middle' valign='top'><img
    src='https://www.homier.com/graphics/tab_start.gif' border='0' width='5'
    height='21'></td>
    <td align='middle' valign='middle' class='tabs' style='background-image:
    url(https://www.homier.com/graphics/tab_bg.gif);'><a
    href='https://www.homier.com/default.asp?dpt=0' class='link'
    onmouseover="this.style.color='yellow'"
    onmouseout="this.style.color='white'">Home</a></td>
    <td align='middle' valign='top'><img
    src='https://www.homier.com/graphics/tab_ctr.gif' border='0' width='9'
    height='21'></td>
    <td align='middle' valign='middle' class='tabs' style='background-image:
    url(https://www.homier.com/graphics/tab_bg.gif);'><a
    href='https://www.homier.com/default.asp?dpt=1' class='link'
    onmouseover="this.style.color='yellow'"
    onmouseout="this.style.color='white'">Tools</a></td>
    <td align='middle' valign='top'><img
    src='https://www.homier.com/graphics/tab_ctr.gif' border='0' width='9'
    height='21'></td>
    <td align='middle' valign='middle' class='tabs' style='background-image:
    url(https://www.homier.com/graphics/tab_bg.gif);'><a
    href='https://www.homier.com/default.asp?dpt=2' class='link'
    onmouseover="this.style.color='yellow'"
    onmouseout="this.style.color='white'">Automotive</a></td>
    <td align='middle' valign='top'><img
    src='https://www.homier.com/graphics/tab_ctr.gif' border='0' width='9'
    height='21'></td>
    <td align='middle' valign='middle' class='tabs' style='background-image:
    url(https://www.homier.com/graphics/tab_bg.gif);'><a
    href='https://www.homier.com/default.asp?dpt=4' class='link'
    onmouseover="this.style.color='yellow'"
    onmouseout="this.style.color='white'">Electronics</a></td>
    <td align='middle' valign='top'><img
    src='https://www.homier.com/graphics/tab_ctr.gif' border='0' width='9'
    height='21'></td>
    <td align='middle' valign='middle' class='tabs' style='background-image:
    url(https://www.homier.com/graphics/tab_bg.gif);'><a
    href='https://www.homier.com/default.asp?dpt=6' class='link'
    onmouseover="this.style.color='yellow'"
    onmouseout="this.style.color='white'">Collectibles</a></td>
    <td align='middle' valign='top'><img
    src='https://www.homier.com/graphics/tab_end.gif' border='0' width='5'
    height='21'></td>
    </tr>
    </table>
    <table cellspacing='0' cellpadding='0' border='0'><tr>
    <td align='middle' valign='top'><img
    src='https://www.homier.com/graphics/tab_start.gif' border='0' width='5'
    height='21'></td>
    <td align='middle' valign='middle' class='tabs' style='background-image:
    url(https://www.homier.com/graphics/tab_bg.gif);'><a
    href='https://www.homier.com/default.asp?dpt=3' class='link'
    onmouseover="this.style.color='yellow'"
    onmouseout="this.style.color='white'">Outdoor Living</a></td>
    <td align='middle' valign='top' class='tab_ends'><img
    src='https://www.homier.com/graphics/tab_ctr.gif' border='0' width='9'
    height='21'></td>
    <td align='middle' valign='middle' class='tabs' style='background-image:
    url(https://www.homier.com/graphics/tab_bg.gif);'><a
    href='https://www.homier.com/default.asp?dpt=5' class='link'
    onmouseover="this.style.color='yellow'"
    onmouseout="this.style.color='white'">Home Furnishings</a></td>
    <td align='middle' valign='top' class='tab_ends'><img
    src='https://www.homier.com/graphics/tab_ctr.gif' border='0' width='9'
    height='21'></td>
    <td align='middle' valign='middle' class='tabs' style='background-image:
    url(https://www.homier.com/graphics/tab_bg.gif);'><a
    href='https://www.homier.com/default.asp?dpt=7' class='link'
    onmouseover="this.style.color='yellow'"
    onmouseout="this.style.color='white'">General Merchandise</a></td>
    <td align='middle' valign='top' class='tab_ends'><img
    src='https://www.homier.com/graphics/tab_ctr.gif' border='0' width='9'
    height='21'></td>
    <td align='middle' valign='middle' class='tabs' style='background-image:
    url(https://www.homier.com/graphics/tab_bg.gif);'><a
    href='https://www.homier.com/default.asp?dpt=99' class='link'
    onmouseover="this.style.color='yellow'"
    onmouseout="this.style.color='white'">See All</a></td>
    <td align='middle' valign='top'><img
    src='https://www.homier.com/graphics/tab_end.gif' border='0' width='5'
    height='21'></td>
    </tr>
    </table>
    </td>
    </tr>
    </table>
    <table cellspacing='0' cellpadding='0' border='0' width='770'>
    <form action='https://www.homier.com/default.asp' method='post'>
    <tr>
    <td class='tab_background' height='28' valign='middle' align='left'
    width='500' nowrap>
    <input type='hidden' name='page' value='search'>
    <input type='hidden' name='pgndx' value='1'>
    &nbsp;&nbsp;Search in:
    <select name='SearchIn' class='search'>
    <option value='0'
     SELECTED
    >All Departments</option>
    <option value='99'
    >Catalog Number</option>
    <option value='1'
    >Tools</option>
    <option value='2'
    >Automotive</option>
    <option value='3'
    >Outdoor Living</option>
    <option value='4'
    >Electronics</option>
    <option value='5'
    >Home Furnishings</option>
    <option value='6'
    >Collectibles</option>
    <option value='7'
    >General Merchandise</option>
    </select>
    &nbsp;for:
    <input type='text' name='SearchFor' size='15' maxlength='30' class='search'
    value=''>
    <input type='image' src='https://www.homier.com/graphics/go.gif' border='0'
    align='absmiddle' alt='Click to search'>
    </td>
    <td align='left' class='b8n_white_000f46' width='100%'
    nowrap>800-348-5004</td>
    <td align='right' class='b8n_white_000f46' width='100' nowrap>
    <a href='https://www.homier.com/default.asp?page=logout'
    class='service'><img src='https://www.homier.com/graphics/lock.gif'
    width='11' height='15' border='0' align='absmiddle'>&nbsp;Sign Out&nbsp;
    </a></td>
    </tr>
    </form>
    </table>
    <table cellspacing='0' cellpadding='0' border='0' width='770'>
    <tr><td valign='top' align='center'>
    <table cellpadding='0' cellspacing='0' border='0' width='750'>
    <tr><td class='e16n_000f46_default'>Order Status & Tracking</td></tr>
    <tr><td align='center'><img src='https://www.homier.com/graphics/grey.gif'
    border='0' width='750' height='1'></td></tr>
    <tr><td class='b9in_default_default' align='right'>Orders 8/1/2004 -
    10/30/2004</td></tr>
    <tr><td>&nbsp;</td></tr>
    </table>
    <table cellpadding='2' cellspacing='0' border='0' width='750'>
    <tr>
    <td class='b8n_default_default'>Order #</td>
    <td class='b8n_default_default'>Ref</td>
    <td class='b8n_default_default'>Order Date</td>
    <td class='b8n_default_default'>Shipped To</td>
    <td class='b8n_default_default'>Status</td>
    <td class='b8n_default_default'>Tracking</td>
    </tr>
    <tr>
    <td class='n8n_CCCCCC_default'><a
    href='https://www.homier.com/default.asp?page=orderdetail&orderid=307377'>16
    0710SE</a></td>
    <td class='n8n_CCCCCC_default'>307377</td>
    <td class='n8n_CCCCCC_default'>10/29/2004</td>
    <td class='n8n_CCCCCC_default'>Stan Johnson Blue Springs, MO</td>
    <td class='n8n_CCCCCC_default'>AR Processing</td>
    <td class='n8n_CCCCCC_default'><a
    href='https://www.homier.com/default.asp?page=tracking&trackingnumber='></a>
    </td>
    </tr>
    </table>

     </td>
    </tr>
    <tr><td colspan='2' align='center'><table cellpadding='0' cellspacing='0'
    border='0'>
    <tr><td align='center'>&nbsp;</td></tr>
    <tr><td align='center'><img src='https://www.homier.com/graphics/grey.gif'
    border='0' width='350' height='1'></td></tr>
    <tr><td align='center'>&nbsp;</td></tr>
    <tr><td align='center' class='n7n_default_default'>
    <a href='https://www.homier.com/default.asp?page=privacy'
    class='menu'>Privacy & Security</a>
    | <a href='https://www.homier.com/default.asp?page=terms' class='menu'>Terms
    of Use</a>
    | <a href='https://www.homier.com/default.asp?page=pressreleases'
    class='menu'>Press Releases</a>
    </td></tr>
    <tr><td align='center' class='n7n_default_default'>
    <a href='https://www.homier.com/default.asp?page=sitemap' class='menu'>Site
    Map</a>
    | <a href='https://www.homier.com/default.asp?page=warranty'
    class='menu'>Warranty & Returns</a>
    | <a href='https://www.homier.com/default.asp?page=shipping'
    class='menu'>Shipping Policy</a>
    </td></tr>
    <tr><td>&nbsp;</td></tr>
    <tr><td align='center' class='copyright'><a
    href='https://www.homier.com/default.asp?page=copyright'>Copyright</a>&nbsp;
    &copy;2004, Homier Distributing Company. All rights reserved.</td></tr>
    </table>
    </td></tr>
    </table>

    </body>
    </html>
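
One possible way to pull out just those cells, as a rough sketch rather than a definitive answer: parse the saved page with PHP's DOM extension and select the matching cells with XPath. This assumes the DOM extension is enabled and PHP 7 or later is in use, neither of which is stated in the post. The function name extract_order_rows is made up for illustration.

```php
<?php
// Hypothetical helper (not from the original post): returns one array per
// table row, holding the text of each <td class='n8n_CCCCCC_default'> cell.
function extract_order_rows(string $html, string $class = 'n8n_CCCCCC_default'): array
{
    if (trim($html) === '') {
        return []; // nothing to parse
    }

    $doc = new DOMDocument();
    // The page is loose HTML 4.01, so collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $rows = [];
    // Select every <tr> that has at least one direct <td> child with the wanted class.
    foreach ($xpath->query("//tr[td[@class='$class']]") as $tr) {
        $cells = [];
        foreach ($xpath->query("td[@class='$class']", $tr) as $td) {
            // textContent drops the <a> markup; collapse line breaks and extra spaces.
            $cells[] = trim(preg_replace('/\s+/', ' ', $td->textContent));
        }
        $rows[] = $cells;
    }
    return $rows;
}
```

It could be called as extract_order_rows(file_get_contents('raw_orderstatus.html')). Each returned row would then hold the order number, ref, order date, shipped-to, status and tracking values in page order, ready to be passed to an INSERT statement. An empty tracking cell comes back as an empty string.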

