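W.I.S.E. stands for Wdxtub’s InSight Engine. Its goal is to build an AI butler, a dream I have had for almost five years. Now I am finally ready.
- 2017.01.04: Updated the tech stack
- 2016.10.17: Finished the first draft
W.I.S.E. is an engine. By collecting relevant information from the Internet together with my own behavior, it helps me efficiently complete the whole pipeline of “information filtering -> knowledge extraction -> insight”, and helps me raise my thinking intensity and learning ability to a higher level.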
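- Basic concepts and common commands of Linux / Unix
- Web services, relational and non-relational databases, caching, security, network models
- API design, client development, SDK integration, UI and interaction design
Let me digress for a moment and explain why I want to build W.I.S.E. in the first place.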
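Both as an undergraduate and as a graduate student, whenever it was time to choose a thesis project, my easily distracted heart started to boil. Although my research had always been in graphics and imaging, strangely enough, the only two theses of my student career had nothing to do with either. As an undergraduate I built an automatic question answering engine: if I entered “What is the capital of China?”, the engine should answer “Beijing”. As a graduate student I built something called WKK, short for Wdx’s Knowledge Kit. At the time I was absorbed in building my own knowledge graph, and all I could think about was how to form an “information/knowledge closed loop”.
Of course, neither attempt can be called a success. Each barely reached a demo-able state before the post-graduation holiday stole my attention. Looking back, if the undergraduate attempt led me through the door of knowledge representation, the graduate attempt was a chemical reaction with minimalism.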
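The first is the JARVIS component (named after the intelligent assistant in Iron Man), which covers three functions: natural language processing for Chinese and English, note and book recommendation, and Internet information crawling. Having built it once, I now have a reasonable understanding of natural language processing, recommender systems, and crawlers.
The second is the note system, a setup built on files and folders, Markdown, and GTD. Although this system has changed almost beyond recognition over time, it all started with the WKK project. On GitHub, everything from WDXPeak to Note 1.0/2.0/3.0 can be seen as an extension of this system.
The third is the assorted functional modules, such as the scripts for Kindle, Evernote, and the web, plus a basic push service. The Kindle scripts in particular are ones I still use today.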
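Although WKK stopped development long ago, my pursuit of more efficient information processing, knowledge acquisition, and conversion of knowledge into ability has never stopped. I have been working on data platforms lately, and on my way to work today, while thinking about the “self data management platform” mentioned in the article 《数据科学工程师指南》 (“A Guide for Data Science Engineers”), I suddenly remembered WKK.
Naming things is the hardest problem in software engineering. After much deliberation, I finally settled on W.I.S.E. as the project name. First, it carries the meaning of “wise”. Second, its full name, Wdxtub’s InSight Engine, happens to fit the theme of the project very well. InSight can be seen as the next form of Knowledge, which makes the project the successor to WKK and a step beyond it. Engine says that this is an engine (yes, back to the idea behind my undergraduate thesis) that provides strong power for all kinds of applications.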
- Go, Beego
- Elasticsearch, MySQL, Redis
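- For a long time there will probably only be a web version; other platforms will depend on my mood
It looks like a lot, but all of it is part of the “closed loop”. There is no deadline this time, so I will build it slowly, bit by bit. More details will be posted to the blog and GitHub as they come. As of today, the project is officially under way!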
This is a tool that runs on my local machine and helps me collect valuable information from specific web sources, manage all my notes, and turn them into knowledge.
The rapid development of computers and the Internet has brought us a brand new lifestyle. After several years of “smart” life, I am frustrated to find that I’m trapped in a digital castle with lots of beautiful rooms. The rooms are so beautiful that most of the time I don’t want to leave. But if one day I lose my smartphone or my computer breaks down, the great castle disappears and life becomes really tough.
On one hand, my castle is my home and makes me feel comfortable. On the other hand, the castle is my prison and keeps me inside. The longer I live there, the harder it is to leave.
In my opinion, instead of being a beautiful castle that locks me in, smart devices should be tools that help me lead a life with more choices and opportunities. That is why I am making this tool: to give myself more freedom to be who I am. The tool should learn to suit me, rather than me learning to use the tool.
With the help of WKK, I want to connect all my digital devices with ease and find the relations between all my notes and ebooks. It is a magic box that mixes books, news, websites, and notes together and organizes them in a natural way so that I can search and unify them easily. This is the process of building my framework of knowledge.
The main focus of WKK is minimalism. It is a different lifestyle from the one I lead now.
Minimalism is a tool that can assist you in finding freedom. Freedom from fear. Freedom from worry. Freedom from overwhelm. Freedom from guilt. Freedom from depression. Freedom from the trappings of the consumer culture we’ve built our lives around. Real freedom.
So I want to make WKK as simple and elegant as possible: the fewer dependencies, the better. That way, I can put more energy into the things that really matter.
Besides minimalism, there are six more keywords for WKK:
Personal, Smart, Automatic, Efficient, Sharing, Creating
Minimalism has helped us:
- Eliminate our discontent
- Reclaim our time
- Live in the moment
- Pursue our passions
- Discover our missions
- Experience real freedom
- Create more, consume less
- Focus on our health
- Grow as individuals
- Contribute beyond ourselves
- Rid ourselves of excess stuff
- Discover purpose in our lives
- Offline. My knowledge base doesn’t rely on the Internet. The core of WKK works even without an Internet connection, so I don’t have to worry about where I am. The battery is the only limit.
- Lightweight. The minimum requirement is just a text editor. I don’t need any specific software to use WKK, and Python is the only requirement for running most of its features. Fewer dependencies, longer battery life.
- Elegant. With the help of Markdown and pandoc, I can generate beautiful documents from marked-up plain text. There is still no need for other applications, just a text editor. With a better note mechanism, I can take notes more efficiently and turn them into knowledge more easily.
- Automatic. Most of the dirty work is done automatically. WKK is a tool that makes me feel better.
- Portable. No proprietary formats or strange arrangements. WKK uses directories and files to manage everything, so it’s highly portable.
- Smart. WKK will learn my habits and find useful information related to specific topics. I don’t have to waste so much time filtering information every day.
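To make the “directories and files” idea concrete, here is a minimal sketch of how a folder of Markdown notes could be indexed using only the Python standard library. It is not WKK’s actual code: the function name `index_notes` and the rule “the title is the first `# ` heading” are assumptions made for illustration.

```python
from pathlib import Path


def index_notes(root):
    """Map each Markdown file under `root` to a title.

    Assumption for this sketch: the title is the first line that starts
    with "# "; if there is none, the file name (without .md) is used.
    """
    index = {}
    for path in sorted(Path(root).rglob("*.md")):
        title = path.stem
        for line in path.read_text(encoding="utf-8").splitlines():
            if line.startswith("# "):
                title = line[2:].strip()
                break
        # Store paths relative to the root so the index survives moving the folder.
        index[path.relative_to(root).as_posix()] = title
    return index
```

Because the index is keyed by relative paths, the whole knowledge base can live on a portable disk or in a synced folder and still be indexed the same way.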
WKK needs to handle Chinese and English at the same time, so I need a robust approach. The core of WKK is a high-speed web crawler, a directory-based local server, and an analysis and recommendation system. Most of the code is written in Python, which means I can enable or disable components easily. Here are some of the packages I use for WKK:
- Scrapy: The web crawler
- Flask: The web framework
- Jieba: Chinese word segmentation
- NLTK: The Natural Language Toolkit
- NumPy, SciPy, Matplotlib
And several other packages
Here are the features that WKK supports:
- Collect: Notes, RSS, Wiki, News, Social Network
- Improve: Detect Categories, Extract Keywords, Generate Label
- Arrange: Topic Analysis, Note Linking
- Connect: Build Knowledge Base, Find Connections
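As an illustration of the “Note Linking” and “Find Connections” items above, the following sketch collects backlinks between notes from ordinary Markdown links. It is an assumption about how such a feature could work, not a description of WKK itself; `build_backlinks` and the convention that notes reference each other as `[text](other-note.md)` are hypothetical.

```python
import re
from collections import defaultdict

# Matches Markdown inline links whose target ends in .md, e.g. [text](other-note.md)
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+\.md)\)")


def build_backlinks(notes):
    """Given {note_name: markdown_text}, return {target: sorted notes linking to it}.

    Links to targets that are not in `notes` (for example external URLs)
    and links from a note to itself are ignored.
    """
    backlinks = defaultdict(set)
    for name, text in notes.items():
        for target in LINK_RE.findall(text):
            if target in notes and target != name:
                backlinks[target].add(name)
    return {target: sorted(sources) for target, sources in backlinks.items()}
```

A backlink map like this is one simple starting point for a knowledge graph: each note is a node and each link is an edge.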
- Generate Catalog
- Convert to mobi/epub/pdf/html
- Customized CSS
- Support LaTeX math expressions
- (TODO) Support for other markup languages
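The conversion items above could be driven by pandoc, which is mentioned elsewhere in this post. The sketch below only builds and runs a pandoc command line; the helper names `pandoc_command` and `convert` are hypothetical, and the choice to restrict formats to html, epub, and pdf is an assumption of this sketch (mobi is left out because it would need a separate tool, and pdf output requires a PDF engine such as LaTeX to be installed).

```python
import subprocess
from pathlib import Path

# Output formats handled by this sketch (an assumption, not WKK's full list).
SUPPORTED = {"html", "epub", "pdf"}


def pandoc_command(source, fmt, css=None, toc=True):
    """Build the argument list for converting one Markdown note with pandoc."""
    if fmt not in SUPPORTED:
        raise ValueError(f"unsupported format: {fmt}")
    source = Path(source)
    output = source.with_suffix("." + fmt)
    cmd = ["pandoc", source.as_posix(), "-o", output.as_posix()]
    if toc:
        cmd.append("--toc")  # generate a table of contents
    if fmt == "html":
        # Standalone page; render LaTeX math in the browser via MathJax.
        cmd += ["--standalone", "--mathjax"]
        if css:
            cmd += ["--css", str(css)]  # custom stylesheet (HTML only here)
    return cmd


def convert(source, fmt, **options):
    """Run pandoc; raises CalledProcessError if the conversion fails."""
    subprocess.run(pandoc_command(source, fmt, **options), check=True)
```

Keeping command construction separate from execution makes the options easy to inspect before anything is written to disk.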
- Note System
- HTML to Markdown (from the Internet)
- Link to different notes
- Theme notes, each consisting of several notes that together form a key idea
- Topic Crawler, which updates related topics from RSS, blogs, and Weibo every day
- Auto Tag
- Auto Keywords
- Similar Note
- GTD-style arrangement (Inbox Folder, Knowledge Folder)
- Note Rating
- Import from and export to different applications (e.g. Evernote)
- Recommend related books according to the content of the notes
- Generate Knowledge Graph
- Find keywords for ebooks
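The “Auto Keywords” and “Similar Note” items in this list can be approximated with TF-IDF weights and cosine similarity. The sketch below uses only the standard library and a plain ASCII tokenizer, so it is a simplified stand-in rather than WKK’s real pipeline: handling Chinese text would need a segmenter such as Jieba in place of `tokenize`, and all function names here are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase ASCII word tokenizer (Chinese would need a real segmenter)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(notes):
    """Return {note_name: {term: tf-idf weight}} for a dict of note texts."""
    counts = {name: Counter(tokenize(text)) for name, text in notes.items()}
    # Document frequency: in how many notes does each term appear?
    doc_freq = Counter(term for c in counts.values() for term in c)
    n = len(notes)
    vectors = {}
    for name, c in counts.items():
        total = sum(c.values())
        vectors[name] = {
            term: (freq / total) * math.log(n / doc_freq[term])
            for term, freq in c.items()
        }
    return vectors


def top_keywords(vector, k=5):
    """Terms with the highest weight; ties are broken alphabetically."""
    ranked = sorted(vector.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:k]]


def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(
        sum(w * w for w in b.values())
    )
    return dot / norm if norm else 0.0


def most_similar(name, vectors):
    """Name of the other note closest to `name`, or None if it is the only note."""
    others = [other for other in vectors if other != name]
    if not others:
        return None
    return max(others, key=lambda other: cosine(vectors[name], vectors[other]))
```

Terms that appear in every note get a weight of zero, which is what pushes common words out of the keyword list without a stop-word table.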
Here is the list of techniques that support WKK:
- Machine Translation
- Note clustering
- Multi-document generic text summarization
WKK is my main digital tool for a minimalist life. I used to use the following applications to take notes, record my ideas, and organize my ebooks:
- Evernote: Collect notes from the Internet and WeChat
- Wunderlist: GTD tool
- Calibre: Ebook management
But these days I have so many digital devices:
- Android Phone
- Windows Phone
- Windows Notebook
It is very difficult to find a tool that connects these different platforms with user-friendly applications. More commonly, a tool works well on one platform, but each platform has its own logic, which makes things complicated.
What’s more, with the applications mentioned above, different kinds of data are kept separate. They are not connected! If notes are not connected, how can I extract the relations between them? And if there are no relations, how can I turn them into knowledge?
So what do I need?
- A directory-based Markdown (plain text) note system
- I can take notes, write blog posts, etc.
- No need to install anything to use the basic functions. A portable disk can be my whole knowledge base
- An import and export toolkit that covers all platforms
- I can easily continue my work on different devices and in different situations
- Easy integration with cloud services such as GitHub and Dropbox
- Automatic Information Crawler
- Given a specific topic, find related books and news
- Concept definitions from wikis
- Relation Finder
- Find similar notes or web information
- Extract Keywords
- Logic Combiner among different topics
This is why I am making WKK.
Minimalism is a tool used to rid yourself of life’s excess in favor of focusing on what’s important so you can find happiness, fulfillment, and freedom.
When I learned about minimalism, it felt as if what I had been pursuing for so long had finally found an answer. So I want to build WKK in a minimalist way.
- Personal: Except for the common language database from the Internet, all the training data comes from my own notes and reading. As a result, WKK will be a very personal toolkit that knows my language patterns and favorite topics.
- Smart: The best part of WKK is that the longer I use it, the better it understands me. With the help of research on big data, it can give me useful recommendations related to the topics and themes I’m interested in.
- Automatic: Tools are helpers that do most of the dirty work for humans, and so is WKK. Most of its features work automatically, as if the tool did not exist, so I can use it in a natural way. The only thing I need to do is focus on learning and thinking. The rest is handled by WKK.
- Efficient: There is so much information arriving every moment of every day: mail, news, Twitter, Facebook, Weibo, WeChat, etc. WKK gathers all of this information and gives me a daily report, so I don’t need to waste so much time filtering it. WKK does it for me.
- Sharing: With the help of WKK it is easy to share my notes or knowledge with others. The supported formats are: html, epub, mobi, azw3, pdf, jpg, email, etc.
- Creating: With the knowledge graph generated by WKK, I can find related topics and themes easily. With new connections, new thoughts and ideas keep coming.
This is the philosophy of WKK, and also the philosophy of minimalism as I see it.