Knowledge Base · v1.3.0

Website Crawling

Website Crawling (2025-12-30): Honestify can ingest public content from your portfolio, blog, or GitHub Pages to enrich your knowledge base without manual copy-paste.

December 30, 2025 · 4 min read · Honestify Engineering

Feature area: Website Crawler

This release covers Website Crawling Version 1.3.0, shipped 2025-12-30. Status: shipped. No breaking changes.

Summary

Engineers with strong public writing still had to retype URLs and summaries into Honestify, duplicate work that discouraged knowledge base adoption. The crawler fetches public pages from a user-approved URL allowlist, extracts the main content, and indexes the text for AI retrieval, while honoring robots.txt and per-user rate limits.

What Changed

URL allowlist (New): Users specify domains and max crawl depth before indexing starts.
Content extraction (New): Readability-style main-content detection strips nav and boilerplate.
Crawl scheduling (Improved): Periodic re-crawl option for sites that update frequently.
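
To make the content-extraction item more concrete, here is a minimal sketch of stripping nav and boilerplate from a page. It is a simplification: it filters by tag name using Python's standard html.parser, whereas Readability-style detection typically also scores blocks by text density. The tag list and the names MainTextExtractor and extract_main_text are assumptions for illustration, not Honestify's actual implementation.

```python
from html.parser import HTMLParser

# Tags treated as boilerplate in this sketch (an assumed list, not Honestify's rules).
SKIP_TAGS = {"nav", "header", "footer", "aside", "script", "style"}


class MainTextExtractor(HTMLParser):
    """Collects visible text while ignoring anything nested inside SKIP_TAGS."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while we are inside a boilerplate element
        self._parts = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self._parts.append(text)

    def text(self):
        return "\n".join(self._parts)


def extract_main_text(html):
    """Return the page text with nav, header, footer, and script content removed."""
    parser = MainTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

Calling extract_main_text on a page with a nav bar, an article, and a footer returns only the article's heading and paragraph text, one block per line.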

KB setup time (users with 5+ public pages): ~25 min before, ~12 min after

Why We Built It

We prioritized this work because metrics showed a measurable funnel leak that we could fix in one sprint. The fix needed to be durable rather than a patch, so we addressed root causes in the Website Crawler instead of treating symptoms alone.

Engineers, recruiters, and hiring managers all benefit when Honestify behaves predictably in production. This release reflects that bar.

User Impact

Knowledge base setup time dropped by roughly half, from about 25 minutes to about 12, for users with existing portfolio sites.

Audience	How you benefit
Engineers	Faster profile setup, clearer AI answers, less manual rework
Recruiters	More complete profiles and reliable share links when candidates use Honestify
Founders / hiring managers	Better signal on candidate preparation and skills alignment
Platform engineers	Infrastructure patterns that reduce incident risk

Relevant skills: RAG, Python, system design, TypeScript. Target roles: AI engineer, backend engineer, DevOps engineer.

Technical Highlights

Robots.txt compliance with crawl-delay respect
Per-user rate limits and concurrent crawl caps
Chunking pipeline for embedding index
Dedup via content hash
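
Three of the highlights above can be sketched with Python's standard library: a robots.txt check that also reads Crawl-delay, word-based chunking, and dedup via a content hash. This is an illustrative sketch under stated assumptions, not the production pipeline. The user-agent string, the default delay, the chunk sizes, the whitespace and case normalization, and all function names are hypothetical.

```python
import hashlib
from urllib.robotparser import RobotFileParser

USER_AGENT = "ExampleCrawler"  # hypothetical user-agent string


def load_rules(robots_txt):
    """Parse robots.txt text that has already been downloaded."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return rules


def fetch_policy(rules, url, default_delay=1.0):
    """Return (allowed, seconds to wait between requests) for this URL."""
    delay = rules.crawl_delay(USER_AGENT)  # None when no Crawl-delay applies
    wait = float(delay) if delay is not None else default_delay
    return rules.can_fetch(USER_AGENT, url), wait


def content_hash(text):
    """SHA-256 of the text after collapsing whitespace and lowercasing (assumed normalization)."""
    normalized = " ".join(text.split()).lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def chunk_words(text, size=200, overlap=20):
    """Split text into overlapping word windows for embedding."""
    words = text.split()
    if not words:
        return []
    step = size - overlap
    stop = max(len(words) - overlap, 1)  # avoid a trailing chunk made only of overlap
    return [" ".join(words[i:i + size]) for i in range(0, stop, step)]


def index_pages(pages):
    """Chunk each page once, skipping pages whose normalized content was already seen."""
    seen = set()
    indexed = {}
    for url, text in pages.items():
        digest = content_hash(text)
        if digest in seen:
            continue
        seen.add(digest)
        indexed[url] = chunk_words(text)
    return indexed
```

Hashing normalized text rather than raw HTML means the same article served at two URLs is indexed once, at the cost of treating pages that differ only in case or spacing as identical.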

The rollout used feature flags with staged percentage increases and automatic rollback when the error budget burned too quickly.
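
One common way to implement a staged percentage rollout is to hash each user into a stable bucket from 0 to 99 and enable the flag for buckets below the current percentage. The sketch below assumes that approach and reduces "error budget burn" to a simple error-rate threshold. The stage values, the threshold, and the function names are hypothetical, not details of Honestify's rollout tooling.

```python
import hashlib


def rollout_bucket(user_id, flag):
    """Map a user to a stable bucket in [0, 100) for a given flag."""
    digest = hashlib.sha256(f"{flag}:{user_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % 100


def is_enabled(user_id, flag, percent):
    """A user stays enabled as the percentage grows, because their bucket never changes."""
    return rollout_bucket(user_id, flag) < percent


def next_stage(current_percent, stages, errors, requests, error_budget):
    """Advance to the next stage, or roll back to 0 if the error rate exceeds the budget."""
    if requests and errors / requests > error_budget:
        return 0  # automatic rollback
    later = [s for s in stages if s > current_percent]
    return later[0] if later else current_percent
```

Including the flag name in the hash keeps bucket assignments independent across flags, so the same users are not always first in line for every release.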

Website Crawling: before vs after

Before

Users manually pasted article text or skipped external content entirely.

After

Submit a root URL; Honestify discovers and indexes linked pages within configured depth and domain bounds.

Users moving from the previous experience can now submit a root URL, and Honestify discovers and indexes linked pages within the configured depth and domain bounds.
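
Discovery within depth and domain bounds can be pictured as a breadth-first traversal that starts at the root URL. The sketch below is one way to express that idea and is not the crawler's actual code. The get_links callback, the subdomain matching rule, and the function names are assumptions. It takes a callback instead of making network requests so that it can run against an in-memory site map.

```python
from collections import deque
from urllib.parse import urldefrag, urljoin, urlparse


def in_allowlist(url, allowed_domains):
    """True if the URL's host is an allowed domain or one of its subdomains (assumed rule)."""
    host = urlparse(url).hostname or ""
    return any(host == d or host.endswith("." + d) for d in allowed_domains)


def discover(root_url, allowed_domains, max_depth, get_links):
    """Breadth-first discovery; get_links(url) returns the hrefs found on that page."""
    if not in_allowlist(root_url, allowed_domains):
        return []
    seen = {root_url}
    order = []
    queue = deque([(root_url, 0)])
    while queue:
        url, depth = queue.popleft()
        order.append(url)
        if depth == max_depth:
            continue  # do not follow links past the configured depth
        for href in get_links(url):
            # Resolve relative links and drop #fragments so each page is visited once.
            link, _fragment = urldefrag(urljoin(url, href))
            if link not in seen and in_allowlist(link, allowed_domains):
                seen.add(link)
                queue.append((link, depth + 1))
    return order
```

With max_depth set to 1, only the root and the pages it links to directly are returned; links to hosts outside the allowlist are ignored at every depth.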

Future Improvements

What we are building next

GitHub README indexing
RSS feed subscriptions
Selective page exclusion UI

Known limitations

JavaScript-heavy SPAs may require falling back to manual paste
Authentication-gated pages are not supported

Feedback welcome: reach us through in-app feedback or support, especially if you hit edge cases that this release did not cover.

This update connects to other Honestify work:

Related updates: Knowledge Base Launch, Improved Prompt Engineering, AI Chat Improvements
Guides: how to learn AI engineering, writing better documentation, building a personal brand
Research: RAG adoption, MCP adoption, emerging technologies on Honestify
Practice questions: explain RAG, explain embeddings, design an AI chatbot
