Java crawler: scraping the images from a web page and saving them locally (program code) - QQ沐编程

Program introduction

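Below is a sample Java program that crawls a web page for images and saves them to a local directory. Save the code as ImageCrawler.java and set the url and savePath variables to the page you want to crawl and the directory where the images should be stored. Then compile the file with a Java compiler and run it to download the images found on that page. Note that crawling means accessing a site and scraping its data, so you must comply with the applicable laws and the site's terms of use. Before using it in practice, make sure you have the legal right and permission to do so, and respect the rights of the site owner.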
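If editing the source for every site is inconvenient, the two values could instead be read from the command line. The following is only a sketch under that assumption: the class name CrawlerArgs, its fields, and the default directory "images" are hypothetical and not part of the original program.

```java
import java.io.File;

// Hypothetical helper, not part of the original program: reads the page URL
// and the save directory from the command line instead of hard-coded values.
final class CrawlerArgs {
    final String url;
    final String savePath;

    private CrawlerArgs(String url, String savePath) {
        this.url = url;
        this.savePath = savePath;
    }

    // args[0] = page URL (required)
    // args[1] = save directory (optional, defaults to "images", an assumed default)
    static CrawlerArgs parse(String[] args) {
        if (args.length < 1 || args.length > 2) {
            throw new IllegalArgumentException("usage: java ImageCrawler <page-url> [save-dir]");
        }
        String dir = args.length == 2 ? args[1] : "images";
        // Normalize the directory so it always ends with a path separator
        if (!dir.endsWith("/") && !dir.endsWith(File.separator)) {
            dir = dir + File.separator;
        }
        return new CrawlerArgs(args[0], dir);
    }
}
```

With a helper like this, main could start with CrawlerArgs parsed = CrawlerArgs.parse(args); and use parsed.url and parsed.savePath in place of the two hard-coded strings.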

Source code

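import java.io.*;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ImageCrawler {

    public static void main(String[] args) {
        String url = "https://example.com"; // URL of the target page
        String savePath = "path/to/save/images/"; // directory where the images are saved

        try {
            // Make sure the target directory exists before writing into it
            File saveDir = new File(savePath);
            if (!saveDir.isDirectory() && !saveDir.mkdirs()) {
                throw new IOException("Cannot create directory: " + savePath);
            }

            // Send an HTTP request and read the page content (assumes the page is UTF-8)
            URL pageUrl = new URL(url);
            StringBuilder webpageContent = new StringBuilder();
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(pageUrl.openStream(), StandardCharsets.UTF_8))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    // Keep the line break so text on adjacent lines does not run together
                    webpageContent.append(line).append('\n');
                }
            }

            // Regular expression that captures the src attribute of <img> tags
            String imgPattern = "<img[^>]+src\\s*=\\s*['\"]([^'\"]+)['\"][^>]*>";
            Pattern pattern = Pattern.compile(imgPattern, Pattern.CASE_INSENSITIVE);
            Matcher matcher = pattern.matcher(webpageContent.toString());

            // Process each matched image link and count the successful downloads
            int saved = 0;
            while (matcher.find()) {
                if (downloadImage(pageUrl, matcher.group(1), savePath)) {
                    saved++;
                }
            }

            System.out.println("Saved " + saved + " image(s).");
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    // Download one image and save it locally; returns true on success
    private static boolean downloadImage(URL pageUrl, String src, String savePath) {
        try {
            // Resolve relative links such as "/img/a.png" against the page URL
            URL url = new URL(pageUrl, src);
            // Use the path only, so a query string such as "?v=2" stays out of the file name
            String path = url.getPath();
            String fileName = path.substring(path.lastIndexOf("/") + 1);
            if (fileName.isEmpty()) {
                return false; // nothing usable as a file name
            }

            // try-with-resources closes both streams even if the copy fails
            try (InputStream in = url.openStream();
                 FileOutputStream out = new FileOutputStream(new File(savePath, fileName))) {
                byte[] buffer = new byte[4096];
                int bytesRead;
                while ((bytesRead = in.read(buffer)) != -1) {
                    out.write(buffer, 0, bytesRead);
                }
            }
            return true;
        } catch (IOException e) {
            // Covers failed downloads and links java.net.URL cannot parse
            e.printStackTrace();
            return false;
        }
    }
}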
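
The link-extraction step can also be tried in isolation, without any network access. The sketch below assumes a hypothetical helper class ImageLinkExtractor (its name and methods are not part of the original program). It applies the same regular expression as ImageCrawler, resolves relative src values against the page URL with java.net.URL, and derives a file name from the URL path so that a query string is not carried into the name.

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of the original program.
final class ImageLinkExtractor {

    // Same pattern as in ImageCrawler: captures the value of src in an <img> tag
    private static final Pattern IMG_SRC =
            Pattern.compile("<img[^>]+src\\s*=\\s*['\"]([^'\"]+)['\"][^>]*>");

    // Returns the absolute URL of every <img src="..."> found in html.
    static List<String> extractImageUrls(String html, String pageUrl) {
        URL base;
        try {
            base = new URL(pageUrl);
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Invalid page URL: " + pageUrl, e);
        }
        List<String> result = new ArrayList<>();
        Matcher matcher = IMG_SRC.matcher(html);
        while (matcher.find()) {
            try {
                // new URL(base, spec) resolves a relative spec against the page URL
                result.add(new URL(base, matcher.group(1)).toString());
            } catch (MalformedURLException e) {
                // Skip links java.net.URL cannot handle, such as inline "data:" images
            }
        }
        return result;
    }

    // Derives a local file name from the URL path, ignoring any query string.
    static String fileNameOf(String imageUrl) {
        try {
            String path = new URL(imageUrl).getPath();
            return path.substring(path.lastIndexOf('/') + 1);
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Invalid image URL: " + imageUrl, e);
        }
    }
}
```

For a page at https://example.com/gallery/index.html, a src of b.jpg resolves to https://example.com/gallery/b.jpg, and a src of /img/a.png resolves to https://example.com/img/a.png. A regular expression is a simple way to find img tags but not a robust one; for pages with unusual markup, an HTML parser may be the safer choice.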

