Abstract
We collected a large dataset of more than 21 000 websites by crawling the public resources of the Czech Internet. The proposed method for detecting website hosting, together with its geographic location and software, was applied to the collected data to extend the basic statistical information about Czech websites published by the national domain registrar CZ.NIC. For the analysis, we divided the data into nine categories to show the differences between them, for example between the public and private sectors. The procedures used in this paper may also be applied to an extended analysis of websites in other countries, for example to verify compliance with legal directives that the public sector must implement.
Original language | English |
---|---|
Pages (from-to) | 33-48 |
Number of pages | 16 |
Journal | Statistika |
Volume | 99 |
Issue number | 1 |
Publication status | Published - 2019 |
Externally published | Yes |
Bibliographical note
Statistika is an open access journal, which means that all its content is freely available without charge to the user or his/her institution. Users are allowed to read, download, copy, distribute, print, search, or link to the full texts of the articles, or use them for any other lawful non-commercial purpose, without asking prior permission from the publisher or the author (since 2017, all journal content has been licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, BY-NC-SA).
Keywords
- CZ.NIC
- Czech Republic
- Geographical location
- Hosting
- Internet
- Web content