- System : Windows 10
- Python : 3.6.0
- Java JRE : 1.8.0_161
- Zookeeper : 3.4.11
- Kafka : 2.11-1.1.0 (Kafka 1.1.0, built for Scala 2.11)
- Chromedriver
- Install chromedriver
- Refer to "How to Install Apache Kafka on Windows OS" (如何在 Windows OS 安裝 Apache Kafka)
- Install Java JRE
- Install Kafka
- Install Zookeeper
- Start Zookeeper and Kafka
- Create your topics
\kafka\bin\windows\kafka-topics.bat --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic YOUR_TOPIC
- Write the Python web crawler with pykafka
- Build TF-IDF Model for each article
- Pipe the results to the Python text analytics
python newsCrawler.py p
python newsCrawler.py e
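
The crawler steps above (compute TF-IDF for each article, then pipe the result through Kafka with pykafka) could look roughly like the sketch below. It is not the actual `newsCrawler.py`: the function names, the JSON message layout (`title`, `tfidf`), the sample articles, and the broker address `localhost:9092` are all assumptions made for illustration. The regex tokenizer is also an assumption; Chinese-language articles would need a proper word segmenter in its place.

```python
import json
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (assumed tokenizer)."""
    return re.findall(r"\w+", text.lower())


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    tf = term count / document length, idf = log(N / document frequency).
    """
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))  # count each term once per document
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid dividing by zero on empty text
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


def publish(articles, producer):
    """Send each article's title and TF-IDF weights as one JSON message.

    `producer` is any object with a produce(bytes) method,
    for example a pykafka producer.
    """
    weights = tf_idf([article["text"] for article in articles])
    for article, article_weights in zip(articles, weights):
        payload = {"title": article["title"], "tfidf": article_weights}
        producer.produce(json.dumps(payload).encode("utf-8"))


def run_crawler_demo():
    """Publish two placeholder articles; needs pykafka and a running broker."""
    # Imported here so the helpers above work without pykafka installed.
    from pykafka import KafkaClient

    client = KafkaClient(hosts="localhost:9092")  # assumed broker address
    topic = client.topics[b"YOUR_TOPIC"]
    articles = [  # placeholder data; a real crawler would scrape these
        {"title": "a", "text": "kafka streams news"},
        {"title": "b", "text": "python news crawler"},
    ]
    with topic.get_sync_producer() as producer:
        publish(articles, producer)
```

Calling `run_crawler_demo()` requires Zookeeper and Kafka to be running and the topic to exist. `tf_idf` and `publish` can be tried on their own with any object that has a `produce` method.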
- Analytics
python analytics.py p
python analytics.py e
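
On the consuming side, the analytics step might read messages from the topic and rank terms by weight. The following is a minimal sketch, not the actual `analytics.py`: it assumes each message is a JSON object with hypothetical `title` and `tfidf` fields, that the broker listens on `localhost:9092`, and that reporting the top three terms is the analysis of interest.

```python
import json


def top_terms(tfidf, k=3):
    """Return the k highest-weighted terms, best first (ties in alphabetical order)."""
    ranked = sorted(tfidf.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:k]]


def analyze(messages):
    """Decode raw JSON messages and yield (title, top terms) for each one."""
    for raw in messages:
        record = json.loads(raw.decode("utf-8"))
        yield record["title"], top_terms(record["tfidf"])


def run_analytics_demo():
    """Print the top terms per article; needs pykafka and a running broker."""
    # Imported here so the helpers above work without pykafka installed.
    from pykafka import KafkaClient

    client = KafkaClient(hosts="localhost:9092")  # assumed broker address
    topic = client.topics[b"YOUR_TOPIC"]
    # consumer_timeout_ms stops the iteration after 5 s without new messages.
    consumer = topic.get_simple_consumer(consumer_timeout_ms=5000)
    values = (message.value for message in consumer if message is not None)
    for title, terms in analyze(values):
        print(title, terms)
```

Keeping `analyze` independent of Kafka means it can be exercised with a plain list of byte strings before connecting it to a live topic.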