Java Web Crawler: Difference between revisions
| valign="top" |
* [https://stackoverflow.com/questions/44781339/ Spring Boot Web Application using Selenium WebDriver]
* [https://dzone.com/articles/automated-testing-with-junit-and-selenium-for-brow Automated Testing With JUnit & Selenium for Browser]
* [https://stackoverflow.com/questions/17749049/ Spring <code>@CacheEvict</code> using wildcards]
* [https://www.foreach.be/blog/spring-cache-annotations-some-tips-tricks Spring Cache Annotations Tips & Tricks]
* [https://bonigarcia.github.io/selenium-jupiter/#quick-reference Selenium Jupiter Quick Reference]
* [https://github.com/bonigarcia/selenium-jupiter Selenium Jupiter]
* [https://github.com/yasserg/crawler4j Crawler4j]
|}
----
{|
| valign="top" |
* [https://hub.docker.com/r/selenium/standalone-firefox Docker Image <code>selenium/standalone-firefox</code>]
* [https://hub.docker.com/r/selenium/standalone-chrome Docker Image <code>selenium/standalone-chrome</code>]
* [https://hub.docker.com/r/selenium/standalone-opera Docker Image <code>selenium/standalone-opera</code>]
| valign="top" |
|}
Revision as of 22:37, 10 October 2020
A web crawler, or spider, is a type of bot typically operated by search engines such as Google and Bing. Its purpose is to index the content of websites across the Internet so that those websites can appear in search engine results.
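The indexing loop described above can be sketched in plain Java. This is only an illustration under simplifying assumptions: the class name `CrawlSketch`, the in-memory map standing in for the web, and the regex-based link extraction are all hypothetical choices made for this example, not part of any library mentioned on this page. A real crawler would fetch pages over HTTP, use a proper HTML parser, and respect `robots.txt`.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CrawlSketch {

    // Naive href matcher (assumption for this sketch); real HTML needs a parser.
    private static final Pattern HREF = Pattern.compile("href=\"([^\"]+)\"");

    /** Returns the href targets found in the given HTML, in document order. */
    static List<String> extractLinks(String html) {
        List<String> links = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            links.add(m.group(1));
        }
        return links;
    }

    /**
     * Breadth-first crawl over an in-memory "web" (URL -> HTML).
     * Returns the URLs in the order they were first visited.
     */
    static List<String> crawl(String seed, Map<String, String> pages) {
        Set<String> visited = new LinkedHashSet<>();
        Deque<String> frontier = new ArrayDeque<>();
        frontier.add(seed);
        while (!frontier.isEmpty()) {
            String url = frontier.poll();
            String html = pages.get(url);
            if (html == null || !visited.add(url)) {
                continue; // unknown page, or page already indexed
            }
            for (String link : extractLinks(html)) {
                if (!visited.contains(link)) {
                    frontier.add(link); // schedule newly discovered link
                }
            }
        }
        return new ArrayList<>(visited);
    }

    public static void main(String[] args) {
        Map<String, String> pages = Map.of(
            "/a", "<a href=\"/b\">b</a> <a href=\"/c\">c</a>",
            "/b", "<a href=\"/a\">a</a> <a href=\"/c\">c</a>",
            "/c", "<a href=\"/missing\">gone</a>");
        System.out.println(crawl("/a", pages)); // prints [/a, /b, /c]
    }
}
```

The visited set is what keeps the crawl from looping when pages link back to each other, and the queue gives breadth-first order, so pages closer to the seed are indexed first.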
Selenium
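# Firefox: run ONE of the two variants (both use the same container name and port).
# Variant 1: give the container 2 GB of shared memory.
docker run --detach \
    --publish 4444:4444 \
    --hostname firefox \
    --name firefox \
    --shm-size 2g \
    selenium/standalone-firefox:80.0
# Variant 2: share the host's /dev/shm with the container.
docker run --detach \
    --publish 4444:4444 \
    --hostname firefox \
    --name firefox \
    --volume /dev/shm:/dev/shm \
    selenium/standalone-firefox:80.0
# Print the container's hosts file.
docker exec -it firefox cat /etc/hosts
# WebDriver endpoint: http://localhost:4444/wd/hub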
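# Chrome: run ONE of the two variants (both use the same container name and port).
# Variant 1: give the container 2 GB of shared memory.
docker run --detach \
    --publish 4444:4444 \
    --hostname chrome \
    --name chrome \
    --shm-size 2g \
    selenium/standalone-chrome:85.0
# Variant 2: share the host's /dev/shm with the container.
docker run --detach \
    --publish 4444:4444 \
    --hostname chrome \
    --name chrome \
    --volume /dev/shm:/dev/shm \
    selenium/standalone-chrome:85.0
# Print the container's hosts file.
docker exec -it chrome cat /etc/hosts
# WebDriver endpoint: http://localhost:4444/wd/hub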
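# Opera: run ONE of the two variants (both use the same container name and port).
# Variant 1: give the container 2 GB of shared memory.
docker run --detach \
    --publish 4444:4444 \
    --hostname opera \
    --name opera \
    --shm-size 2g \
    selenium/standalone-opera:71.0
# Variant 2: share the host's /dev/shm with the container.
docker run --detach \
    --publish 4444:4444 \
    --hostname opera \
    --name opera \
    --volume /dev/shm:/dev/shm \
    selenium/standalone-opera:71.0
# Print the container's hosts file.
docker exec -it opera cat /etc/hosts
# WebDriver endpoint: http://localhost:4444/wd/hub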
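Once one of the containers is running, it can help to confirm from Java that the endpoint `http://localhost:4444/wd/hub` is reachable before starting a crawl. The following is a minimal sketch using only the JDK's `java.net.http` client (Java 11+). The class name `HubStatusSketch` and its helper methods are hypothetical, and the `/status` path is an assumption based on the status endpoint that standalone Selenium servers of this generation expose under the hub URL; the sketch only checks for HTTP 200 and does not interpret the response body. In an actual crawler, Selenium's Java client would typically be pointed at the same URL through `RemoteWebDriver`.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

class HubStatusSketch {

    /** Builds the status URI for a hub base URL such as http://localhost:4444/wd/hub. */
    static URI statusUri(String hubUrl) {
        // Tolerate a trailing slash on the base URL.
        String base = hubUrl.endsWith("/") ? hubUrl.substring(0, hubUrl.length() - 1) : hubUrl;
        return URI.create(base + "/status"); // "/status" path is an assumption, see lead-in
    }

    /** Returns true if the hub answers the status request with HTTP 200. */
    static boolean isUp(String hubUrl) {
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(2))
                .build();
        HttpRequest request = HttpRequest.newBuilder(statusUri(hubUrl))
                .timeout(Duration.ofSeconds(5))
                .GET()
                .build();
        try {
            HttpResponse<String> response =
                    client.send(request, HttpResponse.BodyHandlers.ofString());
            return response.statusCode() == 200;
        } catch (IOException e) {
            return false; // connection refused, timeout, or similar I/O failure
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the interrupt flag
            return false;
        }
    }

    public static void main(String[] args) {
        String hubUrl = args.length > 0 ? args[0] : "http://localhost:4444/wd/hub";
        System.out.println(hubUrl + " reachable: " + isUp(hubUrl));
    }
}
```

Because all three containers above publish port 4444, the same check works whichever browser image is currently running.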