T-Cat (黑貓) Shipment Tracking (2025-03-09)

Local Backup

T-Cat shipment tracking query URL

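Observing the features of the HTML body

  • The table has the class .tablelist.
  • Each data td cell has the class .style1.
  • The header row's columns are 包裹查詢號碼 (tracking number), 目前狀態 (current status), 資料登入時間 (time recorded), and 負責營業所 (responsible office).
  • The tracking-number cell uses rowspan to cover every data row and has no class, so only the first data row contains a fourth td.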
  • <!-------------------------------------contentContainer----------------------------------------->
    <div id="contentContainer">
    <!-------------------------------------contentContainer aside--------------------------------->
    <div id="aside">
        <!-------------------------------------contentContainer main---------------------------------->
        <div id="main">
          <!--contentsArea-------------------------------------------------->
          <div class="contentsArea">
            <div class="contentsOne">
              <h2 class="typeA">一般包裹查詢</h2>
              <div class="contentsBtm">
                <div class="contentsInner">
                  <div class="articleTypeA">
                    <p class="paddingR18L10">您所輸入的包裹查詢號碼以及查詢結果如下:   </p>
                    <!--<p class="paddingR18L10">點選包裹查詢號碼,可以查詢包裹的歷史狀態;點選營業所可查詢營業所聯絡方式 </p>-->
                    <p class="paddingR18L10">&nbsp;</p>
                    <table cellpadding="0" cellspacing="0" class="tablelist">
                      <tbody>
    				  <tr class="top">
                        <td height="38">包裹查詢號碼</td>
                        <td>目前狀態</td>
                        <td>資料登入時間</td>
                        <td>負責營業所</td>
                        <!--
                        <td>配送人員</td>
                        -->
                      </tr>
                       <tr valign="center" align="middle" bgcolor="#ffffff">    
                            <td height="44" rowspan="5"><span class="bl12">906999479515</span></td>
                            <td class="style1" bgcolor="yellow" title="包裹已經送達收件人">        <span class="r2"><strong>順利送達</strong></span>    </td>
                            <td class="style1" bgcolor="yellow">        <div align="center">            <span class="bl12">2021/09/07 <br>12:30</span></div>    </td>
                            <td class="style1" bgcolor="yellow">        <span class="bl12">逢甲營業所</span></td>
                        </tr>
                        <tr valign="center" align="middle" bgcolor="#cef4f5">    
                            <td class="style1" title="SD正在將包裹配送到收件人途中">        <span class="bl12">配送中</span>    </td>
                            <td class="style1">        <div align="center">            <span class="bl12">2021/09/07 <br>12:24</span></div>    </td>
                            <td class="style1">        <span class="bl12">逢甲營業所</span></td>
                        </tr>
                        <tr valign="center" align="middle" bgcolor="#ffffff">    
                            <td class="style1" title="SD正在將包裹配送到收件人途中">        <span class="bl12">配送中</span>    </td>
                            <td class="style1">        <div align="center">            <span class="bl12">2021/09/07 <br>05:57</span></div>    </td>
                            <td class="style1">        <span class="bl12">逢甲營業所</span></td>    
                        </tr>
                        <tr valign="center" align="middle" bgcolor="#cef4f5">    
                            <td class="style1" title="包裹正從營業所送到轉運中心,或從轉運中心送到營業所">        <span class="bl12">轉運中</span>    </td>
                            <td class="style1">        <div align="center">            <span class="bl12">2021/09/07 <br>00:02</span></div>    </td>
                            <td class="style1">        <span class="bl12"><a class="text4" href="Foothold_Detail.aspx?ID=500">中區轉運 中心(區)</a></span></td>
                        </tr>
                        <tr valign="center" align="middle" bgcolor="#ffffff">    
                            <td class="style1" title="SD已經至寄件人指定地點收到包裹">        <span class="bl12">已集貨</span>    </td>    
                            <td class="style1">        <div align="center">            <span class="bl12">2021/09/06 <br>18:54</span></div>    </td>    
                            <td class="style1">        <span class="bl12">北二特販二所</span></td>    
                        </tr>
                    </tbody></table>
        ... ... (remainder omitted)
    </div>
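
The two class names noted above are enough to address the data with CSS selectors. The following is a minimal sketch, assuming AngleSharp 0.10 or later (where HtmlParser.ParseDocument is available); the class name SelectorDemo and the trimmed-down HTML constant are made up for illustration and are not part of the original page.

```csharp
using System;
using System.Linq;
using AngleSharp.Html.Parser;

// Hypothetical demo class: shows how ".tablelist tr" and "td.style1" select the data.
public static class SelectorDemo
{
    // Trimmed-down copy of the tracking table: header row plus two data rows.
    private const string Html = @"
<table class=""tablelist""><tbody>
  <tr class=""top""><td>包裹查詢號碼</td><td>目前狀態</td><td>資料登入時間</td><td>負責營業所</td></tr>
  <tr>
    <td rowspan=""2""><span class=""bl12"">906999479515</span></td>
    <td class=""style1""><span class=""r2""><strong>順利送達</strong></span></td>
    <td class=""style1""><span class=""bl12"">2021/09/07 <br>12:30</span></td>
    <td class=""style1""><span class=""bl12"">逢甲營業所</span></td>
  </tr>
  <tr>
    <td class=""style1""><span class=""bl12"">配送中</span></td>
    <td class=""style1""><span class=""bl12"">2021/09/07 <br>12:24</span></td>
    <td class=""style1""><span class=""bl12"">逢甲營業所</span></td>
  </tr>
</tbody></table>";

    // Returns, for every data row, how many cells carry the style1 class.
    public static int[] CountStyle1CellsPerRow()
    {
        var document = new HtmlParser().ParseDocument(Html);

        return document.QuerySelectorAll(".tablelist tr")
            .Skip(1)                                               // skip the header row
            .Select(tr => tr.QuerySelectorAll("td.style1").Length) // the rowspan cell has no class
            .ToArray();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", CountStyle1CellsPerRow()));
    }
}
```

Every data row yields three style1 cells (status, time, office), even though the first row physically holds four td elements. That is why filtering on the class gives a uniform column layout across rows.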

Introduction to AngleSharp

How do we fetch the data from the web page?

Method 1 - Use a hand-rolled HttpClient to get an HttpResponseMessage

  • using System;
    using System.Net;
    using System.Net.Http;
    using System.Threading.Tasks;
    
    /// <summary>
    /// T-Cat service
    /// </summary>
    public class TCatService
    {
        /// <summary>
        /// HttpClient should be a single shared instance. Creating and disposing one per request
        /// opens too many sockets and wastes resources.
        /// </summary>
        private static HttpClient _client;
    
        /// <summary>
        /// Static constructor for TCatService
        /// </summary>
        static TCatService()
        {
            // Create the HttpClient instance
            CreateInstance();
        }
    
        /// <summary>
        /// Gets the HttpClient
        /// </summary>
        public HttpClient HttpClient => _client;
    
        /// <summary>
        /// T-Cat's domain
        /// </summary>
        private static string RootUri => "https://www.t-cat.com.tw/";
    
        /// <summary>
        /// Gets the response of the shipment tracking query
        /// </summary>
        /// <param name="waybillID">Waybill (tracking) number</param>
        /// <returns>The raw HTTP response</returns>
        public async Task<HttpResponseMessage> GetTraceDetail_Response(string waybillID)
        {
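            // Trim whitespace and reject a missing waybill number
            waybillID = waybillID?.Trim();
            if (string.IsNullOrEmpty(waybillID))
            {
                throw new ArgumentException("A waybill number is required.", nameof(waybillID));
            }
    
            // Send the request
            //=========================================
            // API (shipment tracking) => Inquire/TraceDetail.aspx
            // -
            // Parameter: BillID
            // ========================================
            var requestUri = $"Inquire/TraceDetail.aspx?BillID={Uri.EscapeDataString(waybillID)}";
    
            return await _client.GetAsync(requestUri);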
        }
    
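        /// <summary>
        /// Creates the HttpClient instance
        /// </summary>
        private static void CreateInstance()
        {
            // Base address (domain)
            var baseUri = new Uri(RootUri);
    
            // Create the instance
            _client = new HttpClient();
            // Set the base address
            _client.BaseAddress = baseUri;
    
            // Close each connection once it has been open for 1 minute (after its current
            // request completes). The default is -1 (never close).
            // Note: ServicePointManager settings apply to .NET Framework; on .NET Core and
            // later, HttpClient does not use them.
            var sp = ServicePointManager.FindServicePoint(baseUri);
            sp.ConnectionLeaseTimeout = (int)TimeSpan.FromMinutes(1).TotalMilliseconds;
    
            // Refresh cached DNS entries every 2 minutes. The value is in milliseconds,
            // and the default is 120000 (2 minutes).
            ServicePointManager.DnsRefreshTimeout = (int)TimeSpan.FromMinutes(2).TotalMilliseconds;
        }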
    }
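
On .NET Core 2.1 and later, the same two ideas (one shared client, connections recycled periodically so DNS changes are picked up) are usually expressed through SocketsHttpHandler rather than ServicePointManager. The sketch below is one possible shape under that assumption; the class name TCatHttp and the helper BuildTraceDetailPath are hypothetical and do not appear in the original service.

```csharp
using System;
using System.Net.Http;

// Hypothetical alternative to the ServicePointManager setup, for .NET Core 2.1+.
public static class TCatHttp
{
    // Same host as used in this article.
    private static readonly Uri BaseUri = new Uri("https://www.t-cat.com.tw/");

    // One shared client for the whole process.
    public static readonly HttpClient Client = CreateClient();

    private static HttpClient CreateClient()
    {
        var handler = new SocketsHttpHandler
        {
            // Retire pooled connections after 1 minute so that new connections
            // re-resolve DNS instead of reusing a stale address forever.
            PooledConnectionLifetime = TimeSpan.FromMinutes(1)
        };

        return new HttpClient(handler) { BaseAddress = BaseUri };
    }

    // Builds the relative request path, escaping the waybill number for use in a query string.
    public static string BuildTraceDetailPath(string waybillId)
    {
        if (string.IsNullOrWhiteSpace(waybillId))
        {
            throw new ArgumentException("A waybill number is required.", nameof(waybillId));
        }

        return "Inquire/TraceDetail.aspx?BillID=" + Uri.EscapeDataString(waybillId.Trim());
    }
}
```

A call such as TCatHttp.Client.GetAsync(TCatHttp.BuildTraceDetailPath("906999479515")) would then play the role of GetTraceDetail_Response.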

Method 2 - Use AngleSharp's context.OpenAsync() to get an IDocument

  • /// <summary>
    /// Gets the tracking status => text content
    /// </summary>
    /// <param name="waybillId">Waybill (tracking) number</param>
    /// <returns>The tracking records as text</returns>
    public async Task<string> GetTraceDetail_Result(string waybillId)
    {
        // Result string
        var result = new StringBuilder();
    
        // Trim whitespace
        waybillId = waybillId.Trim();
    
        // Request address
        var address = $"https://www.t-cat.com.tw/Inquire/TraceDetail.aspx?BillID={waybillId}";
    
        #region Load the AngleSharp configuration
    
        // Use the default configuration for AngleSharp (with the default loader)
        var config = Configuration.Default.WithDefaultLoader();
    
        // Create a new context for evaluating web pages with the given config
        var context = BrowsingContext.New(config);
    
        // Get an HtmlParser
        var htmlParser = context.GetService<IHtmlParser>();
    
        #endregion Load the AngleSharp configuration
    
        // Request the page at the given address and parse the response into a document
        var document = await context.OpenAsync(address);
    
        ... ... (remainder omitted)
    }

Getting the TraceDetails objects

Method 1 - From a System.Net.Http.HttpResponseMessage

  • /// <summary>
    /// Gets the tracking status => list of objects
    /// </summary>
    /// <param name="response">Response returned by GetTraceDetail_Response</param>
    /// <returns>One entry per tracking record</returns>
    public async Task<List<Cat_TraceDetails>> GetTraceDetail(HttpResponseMessage response)
    {
        // Return value
        var result = new List<Cat_TraceDetails>();
    
        // Check the response
        response.EnsureSuccessStatusCode();
    
        // Get the waybillId:
        // split the original request URI on the BillID parameter
        var separator = new string[] { "BillID=" };
        var waybillId = response.RequestMessage
            .RequestUri
            .AbsoluteUri.Split(separator, StringSplitOptions.RemoveEmptyEntries)[1];
    
        // Read the response body
        var res_Content = await response.Content.ReadAsStringAsync();
    
        #region Load the AngleSharp configuration
    
        // Use the default configuration for AngleSharp (with the default loader)
        var config = Configuration.Default.WithDefaultLoader();
    
        // Create a new context for evaluating web pages with the given config
        var context = BrowsingContext.New(config);
    
        #endregion Load the AngleSharp configuration
    
        // Create a virtual request whose content is the HTML string we already downloaded
        var document = await context.OpenAsync(x => x.Content(res_Content));
    
        // Get the tablelist under contentsArea
        var table = document.QuerySelector(".tablelist");
        if (table == null)
        {
            throw new Exception($"Tracking table (.tablelist) not found for waybill ({waybillId})!!");
        }
    
        // Get the list of tr rows
        var tr_list = table.QuerySelectorAll("tr");
    
        // Only the header row came back => data not found!
        if (tr_list.Length <= 1)
        {
            throw new Exception($"No tracking records found for waybill ({waybillId})!!");
        }
    
        // Get the body rows (skip the header row)
        var tr_body = tr_list.Skip(1);
    
        // Get the waybillId => tr > td > .bl12
        var id = tr_body
            .FirstOrDefault()
            .QuerySelector("td")
            .QuerySelector(".bl12")
            .TextContent
            .Trim();
    
        // The request and the returned data do not match!!
        if (!waybillId.Equals(id))
        {
            // Throw an error
            throw new Exception($"Request ({waybillId}) does not match the returned data ({id})!!");
        }
    
        // Map the data out of tr_body
        result = tr_body
            .Select(tr =>
            {
                // Get the td cells whose class is style1
                var td_list = tr.QuerySelectorAll("td")
                                .Where(x => (x.ClassName ?? "").Equals("style1"))
                                .ToList();
    
                // Map each row to a TraceDetails
                return new Cat_TraceDetails
                {
                    WaybillId = id,
                    GoodStatus = td_list[0].TextContent.Trim(),
                    ReceiveTime = Convert.ToDateTime(td_list[1].TextContent),
                    Station = td_list[2].TextContent.Trim()
                };
            }).ToList();
    
        // Return the data
        return result;
    }
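
The time cell's text contains indentation and a line break's worth of whitespace, and Convert.ToDateTime interprets it using the current culture. A more defensive variant is sketched below: it collapses the whitespace and parses with a fixed format under the invariant culture. The class name TrackingTime and the format string "yyyy/MM/dd HH:mm" are assumptions based on the sample HTML shown earlier, not something the site documents.

```csharp
using System;
using System.Globalization;

// Hypothetical helper: culture-independent parsing of the "time recorded" cell.
public static class TrackingTime
{
    public static DateTime Parse(string raw)
    {
        if (raw == null)
        {
            throw new ArgumentNullException(nameof(raw));
        }

        // A null separator array makes Split break on any whitespace;
        // joining with single spaces collapses runs of spaces and newlines.
        var parts = raw.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);
        var normalized = string.Join(" ", parts);

        // Assumed format, taken from the sample rows (e.g. "2021/09/07 12:30").
        return DateTime.ParseExact(
            normalized,
            "yyyy/MM/dd HH:mm",
            CultureInfo.InvariantCulture,
            DateTimeStyles.None);
    }

    public static void Main()
    {
        var parsed = Parse("        2021/09/07 \n   12:30  ");
        Console.WriteLine(parsed.ToString("s")); // 2021-09-07T12:30:00
    }
}
```

If the site ever changes the timestamp layout, ParseExact throws a FormatException instead of silently guessing, which may or may not be the behavior you want.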

Method 2 - From an AngleSharp.Dom.IDocument

  • /// <summary>
    /// Gets the tracking status => list of objects
    /// </summary>
    /// <param name="waybillId">Waybill (tracking) number</param>
    /// <returns>One entry per tracking record</returns>
    public async Task<List<Cat_TraceDetails>> GetTraceDetail(string waybillId)
    {
        // Return value
        var result = new List<Cat_TraceDetails>();
    
        // Trim whitespace
        waybillId = waybillId.Trim();
    
        // Request address
        var address = $"https://www.t-cat.com.tw/Inquire/TraceDetail.aspx?BillID={waybillId}";
    
        #region Load the AngleSharp configuration
    
        // Use the default configuration for AngleSharp (with the default loader)
        var config = Configuration.Default.WithDefaultLoader();
    
        // Create a new context for evaluating web pages with the given config
        var context = BrowsingContext.New(config);
    
        #endregion Load the AngleSharp configuration
    
        // Request the page at the given address and parse the response into a document
        var document = await context.OpenAsync(address);
    
        // Get the tablelist under contentsArea
        var table = document.QuerySelector(".tablelist");
        if (table == null)
        {
            throw new Exception($"Tracking table (.tablelist) not found for waybill ({waybillId})!!");
        }
    
        // Get the list of tr rows
        var tr_list = table.QuerySelectorAll("tr");
    
        // Only the header row came back => data not found!
        if (tr_list.Length <= 1)
        {
            throw new Exception($"No tracking records found for waybill ({waybillId})!!");
        }
    
        // Get the body rows (skip the header row)
        var tr_body = tr_list.Skip(1);
    
        // Get the waybillId => tr > td > .bl12
        var id = tr_body
            .FirstOrDefault()
            .QuerySelector("td")
            .QuerySelector(".bl12")
            .TextContent
            .Trim();
    
        // The request and the returned data do not match!!
        if (!waybillId.Equals(id))
        {
            // Throw an error
            throw new Exception($"Request ({waybillId}) does not match the returned data ({id})!!");
        }
    
        // Map the data out of tr_body
        result = tr_body
            .Select(tr =>
            {
                // Get the td cells whose class is style1
                var td_list = tr.QuerySelectorAll("td")
                                .Where(x => (x.ClassName ?? "").Equals("style1"))
                                .ToList();
    
                // Map each row to a TraceDetails
                return new Cat_TraceDetails
                {
                    WaybillId = id,
                    GoodStatus = td_list[0].TextContent.Trim(),
                    ReceiveTime = Convert.ToDateTime(td_list[1].TextContent),
                    Station = td_list[2].TextContent.Trim()
                };
            }).ToList();
    
        // Return the data
        return result;
    }
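
Both GetTraceDetail overloads map table rows to Cat_TraceDetails in the same way, but the class itself is not listed on this page. The sketch below guesses its shape from the property names and the JSON output, and shows how the shared row mapping could live in one helper that accepts any IDocument. The helper name TraceTableParser and the fixed timestamp format are assumptions for illustration only.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using AngleSharp.Dom;

// Assumed shape, inferred from the property names used in this article.
public class Cat_TraceDetails
{
    public string WaybillId { get; set; }
    public string GoodStatus { get; set; }
    public DateTime ReceiveTime { get; set; }
    public string Station { get; set; }
}

// Hypothetical helper holding the row mapping shared by both overloads.
public static class TraceTableParser
{
    public static List<Cat_TraceDetails> Parse(IDocument document, string waybillId)
    {
        var table = document.QuerySelector(".tablelist")
            ?? throw new InvalidOperationException("Table .tablelist was not found.");

        // Skip the header row; an empty remainder means no tracking records.
        var rows = table.QuerySelectorAll("tr").Skip(1).ToList();
        if (rows.Count == 0)
        {
            throw new InvalidOperationException($"No tracking records for waybill {waybillId}.");
        }

        return rows.Select(tr =>
        {
            // Status, time, and office cells all carry the style1 class.
            var cells = tr.QuerySelectorAll("td.style1");
            return new Cat_TraceDetails
            {
                WaybillId = waybillId,
                GoodStatus = cells[0].TextContent.Trim(),
                ReceiveTime = ParseTime(cells[1].TextContent),
                Station = cells[2].TextContent.Trim()
            };
        }).ToList();
    }

    // Collapses whitespace, then parses with an assumed fixed format.
    private static DateTime ParseTime(string raw)
    {
        var normalized = string.Join(" ",
            raw.Split((char[])null, StringSplitOptions.RemoveEmptyEntries));
        return DateTime.ParseExact(normalized, "yyyy/MM/dd HH:mm",
            CultureInfo.InvariantCulture, DateTimeStyles.None);
    }
}
```

With a helper like this, Method 1 would pass the document built from the response body and Method 2 the document returned by context.OpenAsync(address), so the two overloads differ only in how the document is obtained.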

How do we use it?

  • static void Main(string[] args)
    {
        // Create the T-Cat service
        var cat = new TCatService();
    
        // Waybill number
        var waybillId = "906999479515";
    
        try
        {
            #region Method 1 => hand-rolled HttpClient
            // Get the response.
            // GetAwaiter().GetResult() rethrows the original exception; .Result would wrap it
            // in an AggregateException and hide the message printed in the catch block.
            var response = cat.GetTraceDetail_Response(waybillId).GetAwaiter().GetResult();
            // Get the tracking records from the response
            var traceDetails = cat.GetTraceDetail(response).GetAwaiter().GetResult();
    
            #endregion
    
            #region Method 2 => use the AngleSharp context's OpenAsync to get the document (IDocument)
            //// Get the tracking records (as text)
            //var text = cat.GetTraceDetail_Result(waybillId).GetAwaiter().GetResult();
            //Console.WriteLine(text);
            //// Get the tracking records
            //var traceDetails = cat.GetTraceDetail(waybillId).GetAwaiter().GetResult();
    
            #endregion
    
    
            // Serialize the objects
            var json = JsonConvert.SerializeObject(traceDetails, Formatting.Indented);
            // Print
            Console.WriteLine(json);
        }
        catch (Exception ex)
        {
            Console.WriteLine("An error occurred:");
            Console.WriteLine(ex.Message);
        }
        finally
        {
            Console.ReadKey();
        }
    }
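
When a synchronous Main blocks on a task, the way it blocks decides which exception the catch block sees. The small self-contained sketch below illustrates the difference; the class name BlockingDemo and the failing task are invented for the demonstration and stand in for a failed tracking query.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical demo: compares how .Result and GetAwaiter().GetResult() surface a failure.
public static class BlockingDemo
{
    // Stand-in for a tracking query that fails.
    private static Task<int> FailAsync() =>
        Task.FromException<int>(new InvalidOperationException("waybill not found"));

    // .Result wraps the failure in an AggregateException.
    public static string ExceptionTypeWithResult()
    {
        try { return FailAsync().Result.ToString(); }
        catch (Exception ex) { return ex.GetType().Name; }
    }

    // GetAwaiter().GetResult() rethrows the original exception.
    public static string ExceptionTypeWithGetResult()
    {
        try { return FailAsync().GetAwaiter().GetResult().ToString(); }
        catch (Exception ex) { return ex.GetType().Name; }
    }

    public static void Main()
    {
        Console.WriteLine(ExceptionTypeWithResult());    // AggregateException
        Console.WriteLine(ExceptionTypeWithGetResult()); // InvalidOperationException
    }
}
```

With .Result, printing ex.Message shows only the generic AggregateException text, and the real message sits in ex.InnerException. On C# 7.1 or later, declaring Main as static async Task Main and using await avoids the question entirely.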

Output

  • URL: https://www.t-cat.com.tw/Inquire/TraceDetail.aspx?BillID=906999479515
    ------------------------------------------------------------
    包裹查詢號碼    目前狀態        資料登入時間    負責營業所
    ------------------------------------------------------------
    906999479515    順利送達    2021/09/07 12:30    逢甲營業所
    906999479515    配送中    2021/09/07 12:24    逢甲營業所
    906999479515    配送中    2021/09/07 05:57    逢甲營業所
    906999479515    轉運中    2021/09/07 00:02    中區轉運中心(區)
    906999479515    已集貨    2021/09/06 18:54    北二特販二所
    
    JSON
    ------------------------------------------------------------
    [
      {
        "WaybillId": "906999479515",
        "GoodStatus": "順利送達",
        "ReceiveTime": "2021-09-07T12:30:00",
        "Station": "逢甲營業所"
      },
      {
        "WaybillId": "906999479515",
        "GoodStatus": "配送中",
        "ReceiveTime": "2021-09-07T12:24:00",
        "Station": "逢甲營業所"
      },
      {
        "WaybillId": "906999479515",
        "GoodStatus": "配送中",
        "ReceiveTime": "2021-09-07T05:57:00",
        "Station": "逢甲營業所"
      },
      {
        "WaybillId": "906999479515",
        "GoodStatus": "轉運中",
        "ReceiveTime": "2021-09-07T00:02:00",
        "Station": "中區轉運中心(區)"
      },
      {
        "WaybillId": "906999479515",
        "GoodStatus": "已集貨",
        "ReceiveTime": "2021-09-06T18:54:00",
        "Station": "北二特販二所"
      }
    ]

Permalink study/anglesharp/20250309-001/index.txt · Last modified: 2025/03/09 11:43 by jethro