ahCrawler



Description

ahCrawler is a set of tools for implementing your own search on your website and for analyzing your web content.
It consists of:

  • crawler (spider) and indexer
  • search for your website
  • search statistics
  • website analyzer (SSL check, http header, short titles and keywords, linkchecker, ...)


You need to install it on your own server, so all crawled data stays in your environment. After you fix something on your website, you decide when to rescan it to update the search index or the analyzer data.

License: GNU GPL v3.0

The project is hosted on GitHub: ahcrawler

Download

Last Updates


  • 2020-02-23: v0.102
    • UPDATE: fix conditions for PHP 7.4 (that is, it no longer shows warnings on PHP 7.4)
    • UPDATE: print css
    • UPDATE: langedit: add comparison of the number of specifiers (that is, more checks when editing language files)
  • 2020-01-19: v0.101
    • ADDED: backend: page for bookmarklet (moved from about page)
    • UPDATE: page for lang texts
    • UPDATE: css in overview pages
    • UPDATE: cli class (allow cgi-fcgi as cli too)
    • FIX: search class - remove limit before calculation of ranking
    • FIX: typo in German lang textfile
  • 2020-01-05: v0.100
    • UPDATE: search for % char in text
    • ADDED: backend: page to test search index
  • 2020-01-04: v0.99
    • UPDATE: font-awesome to 5.11.2
    • UPDATE: jquery to 3.4.1
    • UPDATE: Chart.js to 2.9.3
    • UPDATE: medoo to 1.7.8
    • UPDATE: ahcache class
    • UPDATE: cli class
    • FIX: ranking counter in search class: it did not detect a search term at the end of the text
    • UPDATE: improve details for ranking in backend searchindex search
    • UPDATE: http response headers - added non-standard headers

Requirements


  • any webserver with PHP 5.5+ up to PHP 7.4 (PHP 7 is recommended)
  • php-curl (included in php-common)
  • php-pdo and database extension (sqlite or mysql)
  • php-mbstring
  • php-xml
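
To check the PHP side of these requirements before installing, a small shell sketch like the following can help. It is not part of ahCrawler. The function name missing_modules is hypothetical, and the module names (curl, PDO, mbstring, xml, pdo_sqlite, pdo_mysql) are assumptions about how the packages listed above appear in the output of `php -m`.

```shell
#!/bin/sh
# Reads a module list on stdin (one name per line, as printed by `php -m`)
# and prints every required module that is missing from it.
# missing_modules is a hypothetical helper, not an ahCrawler command.
missing_modules() {
    modules=$(cat)
    # Assumed module names for php-curl, php-pdo, php-mbstring and php-xml.
    for ext in curl PDO mbstring xml; do
        printf '%s\n' "$modules" | grep -qix "$ext" || echo "$ext"
    done
    # At least one PDO database driver is needed (sqlite or mysql).
    printf '%s\n' "$modules" | grep -qixE 'pdo_(sqlite|mysql)' \
        || echo "pdo_sqlite or pdo_mysql"
}

# Only query PHP when a command-line binary is available.
if command -v php >/dev/null 2>&1; then
    php -m | missing_modules
fi
```

No output means that all assumed modules were found. Keep in mind that the command-line PHP may load a different configuration than the PHP used by the webserver, so treat the result as a first hint only.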

Screenshots


Just to get a few first impressions ... :-)

Backend - statistics of the entered search terms


ahcrawler :: backend


Backend - analysis of the website

ahcrawler :: backend








Copyright © 2015-2020 Axel Hahn
project page: GitHub (en)
Axel's website (de)