Time: 2023-06-11 02:39:02 | Source: Website Operations
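An Open-Source, Very Handy Website Cloner: HTTrack

The quickest way to grab a site's content is Ctrl + C / Ctrl + V. I am not encouraging site scraping, but as long as the material is open source and not used commercially, I think it is fine for people to borrow from one another; reinventing the wheel is just a waste of time. Still, Ctrl + C / Ctrl + V is tedious, and most images today are protected against hotlinking or have had their paths changed from an image host to local relative paths, so plain copy and paste rarely captures a site cleanly. That raises the question: how can a website's content be cloned in full?

One approach is to open view-source:https://xxx.xxx.xxx to see the page source, create an index.html file, and paste the source into it; fetching the page directly with wget works too. As noted above, though, neither copies everything on the page. Later, while learning about Python crawlers, I came across examples of this kind, but the results were not great.

There are also tools such as WebZip and awwwb.com, which are said to work well; I have not tried them, so I cannot vouch for them (I have not used Windows in years). Today I would like to introduce an open-source and very handy website cloner: httrack.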
Installation:

    # Debian/Ubuntu
    sudo apt install httrack

    # CentOS/Fedora
    sudo yum install httrack

    # Gentoo
    sudo emerge httrack

    # macOS, with MacPorts
    sudo port install httrack
    # or with Homebrew
    brew install httrack

    # Build from source
    git clone https://github.com/xroche/httrack.git --recurse
    cd httrack
    ./configure --prefix=$HOME/usr && make -j8 && make install
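The per-platform commands above can be folded into a small lookup, which is handy when the same setup script has to run on several machines. The following is only a sketch: the function names install_cmd_for and suggest_install_cmd are made up for illustration, and the script prints the command rather than running it, so nothing is installed by accident.

```shell
# Hypothetical helper: map a package manager name to the install command
# listed in this article. Only the command strings come from the article.
install_cmd_for() {
  case "$1" in
    apt)    echo "sudo apt install httrack" ;;
    yum)    echo "sudo yum install httrack" ;;
    emerge) echo "sudo emerge httrack" ;;
    port)   echo "sudo port install httrack" ;;
    brew)   echo "brew install httrack" ;;
    *)      return 1 ;;   # unknown package manager
  esac
}

# Print (do not run) the command for the first package manager found on PATH.
suggest_install_cmd() {
  for _pm in apt yum emerge port brew; do
    if command -v "$_pm" >/dev/null 2>&1; then
      install_cmd_for "$_pm"
      return 0
    fi
  done
  echo "no known package manager found; build from source instead" >&2
  return 1
}

suggest_install_cmd || true
```

The probing order (apt, yum, emerge, port, brew) is an arbitrary choice; on a machine with more than one of these installed, reorder the list to taste.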
For build and installation details, see: http://www.httrack.com/page/2/en/index.html

Run httrack --help to see the available options.

The walkthrough below uses https://progit.bootcss.com/ as the example site. Lines starting with # are my annotations, not program output.

    Welcome to HTTrack Website Copier (Offline Browser) 3.49-2
    Copyright (C) 1998-2017 Xavier Roche and other contributors
    To see the option list, enter a blank line or try httrack --help

    # 1. Enter a name for the project
    Enter project name :progit

    # 2. Enter the path where the project will be saved
    Base path (return=/Users/apple/websites/) :/Users/apple/Desktop

    # 3. Enter the URL of the site to clone
    Enter URLs (separated by commas or blank spaces) :https://progit.bootcss.com/

    Action:
    (enter) 1 Mirror Web Site(s)
            2 Mirror Web Site(s) with Wizard
            3 Just Get Files Indicated
            4 Mirror ALL links in URLs (Multiple Mirror)
            5 Test Links In URLs (Bookmark Test)
            0 Quit
    :

    # 4. No special requirements: just press Enter
    Proxy (return=none) :

    You can define wildcards, like: -*.gif +www.*.com/*.zip -*img_*.zip
    # 5. No special requirements: just press Enter
    Wildcards (return=none) :

    You can define additional options, such as recurse level (-r<number>), separated by blank spaces
    To see the option list, type help
    # 6. No special requirements: just press Enter
    Additional options (return=none) :

    ---> Wizard command line: httrack https://progit.bootcss.com/ -O "/Users/apple/Desktop/progit" -%v

    Ready to launch the mirror? (Y/n) :Y

    Mirror launched on Thu, 15 Aug 2019 11:54:40 by HTTrack Website Copier/3.49-2 [XR&CO'2014]
    mirroring https://progit.bootcss.com/ with the wizard help..
    Done.
    Thanks for using HTTrack!
    *
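The "Wizard command line" printed near the end of the session suggests that the prompts can be skipped by passing the same arguments directly. The sketch below rebuilds that line from a URL, a base path, and a project name. The helper name mirror_cmd is hypothetical, the depth value in the second call is an arbitrary illustration of the -r<number> option mentioned by the wizard, and the script only prints the command; it does not start a mirror.

```shell
# Usage: mirror_cmd URL BASE_PATH PROJECT [extra httrack arguments...]
# The body runs in a subshell ( ... ) so its variables do not leak out.
mirror_cmd() (
  url=$1 base=$2 project=$3
  shift 3
  # Same shape as the wizard output: -O sets the output directory,
  # -%v is the flag the wizard itself appended.
  printf 'httrack %s -O "%s/%s" -%%v' "$url" "$base" "$project"
  # Extra arguments (recurse level, wildcard filters) are single-quoted
  # so that patterns such as -*.gif are not expanded by the shell.
  for extra in "$@"; do
    printf " '%s'" "$extra"
  done
  printf '\n'
)

# Reproduces the command line shown by the wizard:
mirror_cmd "https://progit.bootcss.com/" "/Users/apple/Desktop" "progit"
# → httrack https://progit.bootcss.com/ -O "/Users/apple/Desktop/progit" -%v

# With a recurse level and a wildcard filter taken from the wizard's hints:
mirror_cmd "https://progit.bootcss.com/" "/Users/apple/Desktop" "progit" -r2 '-*.gif'
```

Printing instead of executing keeps the example safe to try; paste the printed line into a terminal to launch the actual mirror.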
Copyright © 億企邦 1997-2025. All rights reserved.