Overview

This is a database of Internet places: mostly domains, sometimes other things. Think of it as an Internet meta database. This repository contains link metadata: title, description, publication date, etc.

Acceptable link types

domains
repository links. For example https://github.com/rumca-js/Internet-Places-Database
user spaces. For example a YouTube channel link (Linus Tech Tips YouTube Channel) or an X/Twitter user account

Not acceptable link types

malware sites
porn, casino, gambling etc.
IT infrastructure domains, CDN domains
analytic domains that are used for user surveillance

Some zen rules:

Anything not obeying the law will be removed from lists
Internet operates in ... many countries, so there are many laws
If things are offensive, they do not have to be removed
If page content is obnoxious, it can, and possibly should, be demoted
I do not always follow these rules strictly

If any link is suspicious and should be removed, please create an Issue in this repository. Links are captured from the Internet automatically, and I do not have the resources to verify them all. Use 'votes' to judge the credibility of domains.
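As a rough illustration of using 'votes' as a credibility signal, the sketch below keeps only entries at or above a vote threshold and sorts them highest first. It assumes each link entry is a dictionary with a numeric `votes` key; that key name, the `link` key used in the sample call, and the `credible_links` helper are all hypothetical, so check the actual data before relying on them.

```python
def credible_links(entries, min_votes=1):
    """Return entries with at least `min_votes` votes, highest first.

    Assumption: each entry is a dict with a numeric "votes" key.
    Entries without a usable value are treated as having zero votes.
    """
    def votes(entry):
        value = entry.get("votes")
        return value if isinstance(value, (int, float)) else 0

    kept = [entry for entry in entries if votes(entry) >= min_votes]
    return sorted(kept, key=votes, reverse=True)
```

For example, `credible_links(entries, min_votes=5)` would drop everything with fewer than five votes. The threshold is a matter of taste, not something defined by this repository.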

Sources of data

Obtained by the Django-link-archive web crawler.

Sources:

Alternative solutions

Files

The database is distributed as a set of JSON files. We do not want to store binary data or binary files. SQL files should be fine, but I am going with JSON files for now.

Each link contains a set of attributes, like:

title
description
page rating
date of creation
date of last seen
etc.
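As a minimal sketch of how the JSON files could be read, the code below walks a directory, loads every JSON file, and summarizes each link. The file layout (a list of objects or a single object per file) and the key names `title` and `page_rating` are assumptions made for this example, not confirmed names from the data; the helpers `iter_links` and `summarize` are likewise hypothetical.

```python
import json
from pathlib import Path


def iter_links(directory):
    """Yield link entries (dicts) from every *.json file under `directory`.

    Assumption: each file holds either a list of link objects or a
    single link object. Adjust this if the real layout differs.
    """
    for path in sorted(Path(directory).rglob("*.json")):
        with path.open(encoding="utf-8") as handle:
            data = json.load(handle)
        entries = data if isinstance(data, list) else [data]
        for entry in entries:
            if isinstance(entry, dict):
                yield entry


def summarize(entry):
    """Return a one-line summary; missing keys fall back to placeholders."""
    # Key names here are hypothetical, not confirmed by the repository.
    title = entry.get("title") or "(no title)"
    rating = entry.get("page_rating", "?")
    return f"{title} [rating: {rating}]"
```

Calling `summarize(entry)` for each `entry` in `iter_links("path/to/checkout")` prints one line per link. Using `.get()` keeps the sketch tolerant of entries that lack some attributes.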

Page rating

Content ranking is established by the Django link archive project.

To have a good page rating, it is desirable to follow good standards:

Schema Validator
W3C Validator
Provide HTML meta information. More information is available in the Open Graph Protocol
Provide valid title, which is concise, but not too short
Provide valid description, which is concise, but not too short
Provide valid publication date
Provide valid thumbnail, media image
Provide a valid HTTP status code. No fancy redirects, no JavaScript redirects
Provide an RSS feed. Provide HTML meta information for it, see https://www.petefreitag.com/blog/rss-autodiscovery/
Provide search engine keyword tags
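To make the checklist above more concrete, here is a small sketch that inspects an HTML document for several of these signals using only the Python standard library. It is not the rating algorithm used by the Django link archive project. The chosen checks, including `og:image` for the thumbnail and `article:published_time` for the publication date, are assumptions for illustration, and `MetaAudit` and `audit` are hypothetical names.

```python
from html.parser import HTMLParser

# MIME types commonly used for feed autodiscovery <link> elements.
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class MetaAudit(HTMLParser):
    """Collect title, meta tags, and feed links from an HTML document."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}    # lowercased name/property -> content
        self.feeds = []   # hrefs of autodiscoverable feeds
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            key = attrs.get("name") or attrs.get("property")
            if key and attrs.get("content"):
                self.meta[key.lower()] = attrs["content"]
        elif tag == "link":
            rel = (attrs.get("rel") or "").lower().split()
            kind = (attrs.get("type") or "").lower()
            if "alternate" in rel and kind in FEED_TYPES and attrs.get("href"):
                self.feeds.append(attrs["href"])

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def audit(html):
    """Return a dict of booleans, one per metadata signal found."""
    parser = MetaAudit()
    parser.feed(html)
    parser.close()
    return {
        "title": bool(parser.title.strip()),
        "description": "description" in parser.meta,
        "keywords": "keywords" in parser.meta,
        "open_graph_image": "og:image" in parser.meta,
        "publication_date": "article:published_time" in parser.meta,
        "rss_feed": bool(parser.feeds),
    }
```

Passing the HTML of your own page to `audit` gives a quick view of which signals are missing. It does not check the HTTP status code or redirects, which require an actual request.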

Your page or domain exists alongside thousands of other pages. Keep in mind that your metadata affects your recognition and page ranking.

Remember: a good page is always ranked higher.

You may wonder why I am writing about the search engine "keywords" meta field if Google does not need it. Well, I don't like Google. If we want alternative solutions to exist, it should be possible to easily find your page with simpler search engines. Provide the keywords field if you support the open web.

Notes

Not all domains have to be stored here. I think it would be best to have valuable domains. Certainly we do not want content farms. We do not need sites that do not contribute anything useful to society or to the reader
The distinction is not that clear-cut, but more lenient rules apply to personal sites
I am not that interested in marking Substack or Medium as "personal" sites, as I do not feel that they should be tagged as such

Demo database

Might not be working. Used for development: https://renegat0x0.ddns.net/apps/places/.
