Categories

Deep-learning

  • Hyperparameter Tuning of Reinforcement Learning in TensorFlow

    Hyperparameter tuning is an essential step toward optimal performance of your neural network. Instead of inefficiently tuning by hand, a systematic, automated search is far more efficient and consistent. This post shows how to use the TensorBoard HParams plugin to search for optimal hyperparameters.

    Hyperparameters

    Hyperparameters are the predefined skeleton of your learning model. In contrast to the weights of neurons,...
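
    As a minimal sketch of the kind of sweep the post describes, assuming TensorFlow 2.x and the TensorBoard HParams plugin (the hyperparameter names, ranges, and the toy Keras model below are illustrative, not the post's actual code):

    import tensorflow as tf
    from tensorboard.plugins.hparams import api as hp

    # Illustrative search space -- the real one depends on your RL model.
    HP_UNITS = hp.HParam('num_units', hp.Discrete([32, 64]))
    HP_LR = hp.HParam('learning_rate', hp.Discrete([1e-4, 1e-3]))

    def train_once(hparams, run_dir):
        model = tf.keras.Sequential([
            tf.keras.layers.Dense(hparams[HP_UNITS], activation='relu'),
            tf.keras.layers.Dense(1),
        ])
        model.compile(optimizer=tf.keras.optimizers.Adam(hparams[HP_LR]), loss='mse')
        with tf.summary.create_file_writer(run_dir).as_default():
            hp.hparams(hparams)   # log this run's hyperparameter values
            # ... model.fit(...) and tf.summary.scalar('loss', ...) go here

    run = 0
    for units in HP_UNITS.domain.values:
        for lr in HP_LR.domain.values:
            train_once({HP_UNITS: units, HP_LR: lr}, f'logs/hparam_tuning/run-{run}')
            run += 1

    TensorBoard's HParams dashboard then lets you compare the runs side by side.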

  • GAN

    Source code: everyone can run the code in a Colab environment with a GPU.

    The Generative Adversarial Network (GAN) is a generative framework, alongside variational autoencoders, that brings a minimax game from game theory into unsupervised learning. I have implemented the original GAN proposed by Ian Goodfellow and an improved version for the MNIST digits dataset...
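
    As a rough sketch of the minimax game mentioned above (not the implementation from the post; generator, discriminator, and the noise batch are assumed to be defined elsewhere), the two losses can be written with TensorFlow's cross-entropy helper:

    import tensorflow as tf

    bce = tf.keras.losses.BinaryCrossentropy(from_logits=True)

    def gan_losses(discriminator, generator, real_images, noise):
        fake_images = generator(noise)
        real_logits = discriminator(real_images)
        fake_logits = discriminator(fake_images)
        # Discriminator: push real logits toward 1 and fake logits toward 0.
        d_loss = (bce(tf.ones_like(real_logits), real_logits)
                  + bce(tf.zeros_like(fake_logits), fake_logits))
        # Generator: fool the discriminator by pushing fake logits toward 1
        # (the non-saturating form of the original minimax objective).
        g_loss = bce(tf.ones_like(fake_logits), fake_logits)
        return d_loss, g_loss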

  • Benchmark of Popular Reinforcement Learning Algorithms

    This post reflects on my study of deep reinforcement learning through OpenAI's Spinning Up tutorial. It mainly covers six popular algorithms: Vanilla Policy Gradient (VPG), Trust Region Policy Optimization (TRPO), Proximal Policy Optimization (PPO), Deep Deterministic Policy Gradient (DDPG), Twin Delayed DDPG (TD3), and Soft Actor-Critic (SAC). I have implemented all of these algorithms under the guidance of...
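
    The on-policy methods in that list all build on the same likelihood-ratio gradient; here is a bare-bones NumPy sketch of that estimator (my own notation, not Spinning Up's code):

    import numpy as np

    def policy_gradient_estimate(grad_log_probs, rewards_to_go):
        """Monte Carlo policy gradient used by VPG (and, via surrogate
        objectives, by TRPO and PPO).
        grad_log_probs: (T, n_params) array of grad_theta log pi(a_t | s_t)
        rewards_to_go:  (T,) array of the return from step t onward
        """
        # Each score-function term is weighted by its reward-to-go.
        return (grad_log_probs * rewards_to_go[:, None]).mean(axis=0)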

  • Install GPU-enabled TensorFlow on Ubuntu 18.04

    Ubuntu 18.04 was released recently, and many developers will set up their TensorFlow development environment on the new GNOME-based Ubuntu. Here I show my successful installation of GPU-enabled TensorFlow on Ubuntu 18.04; I hope it helps you out of the mess of cross-platform compatibility. A quick sanity check of the finished setup is sketched after the platform list.

    Platforms

    • Ubuntu 18.04

    • TensorFlow 1.7

    • CUDA 9.1

      ...
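
    Once everything is installed, a quick sanity check (my own snippet, not part of the original post) is to ask TensorFlow whether it can see the GPU:

    import tensorflow as tf

    # Should print True and list a /device:GPU:0 entry if CUDA is set up correctly.
    print(tf.test.is_gpu_available())
    with tf.Session() as sess:
        print(sess.list_devices())
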
  • Understanding the forward and backward pass of LSTM

    The Recurrent Neural Network (RNN) is a learning approach designed for sequence modeling. It is naturally applied in natural language processing (NLP), image captioning, and time-series prediction. One difficulty in training the vanilla RNN is that the gradient vanishes, since the gradient of the hidden state (\(h\)) involves many multiplications by the weight matrix (\(W\)). Long Short Term...
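
    As a rough NumPy sketch of the forward step the post walks through (the standard LSTM equations in my own notation, with the four gate weights stacked into one matrix \(W\)):

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def lstm_step(x, h_prev, c_prev, W, b):
        """One LSTM forward step.
        x: (D,) input; h_prev, c_prev: (H,) previous hidden and cell state;
        W: (4H, D+H) stacked gate weights; b: (4H,) bias."""
        H = h_prev.shape[0]
        z = W @ np.concatenate([x, h_prev]) + b
        i = sigmoid(z[0:H])        # input gate
        f = sigmoid(z[H:2*H])      # forget gate
        o = sigmoid(z[2*H:3*H])    # output gate
        g = np.tanh(z[3*H:4*H])    # candidate cell state
        c = f * c_prev + i * g     # additive cell update
        h = o * np.tanh(c)
        return h, c

    The additive form of the cell update is what lets gradients flow through \(c\) without repeated multiplication by \(W\).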

  • Understanding the forward and backward pass of Batch Normalization

    Batch normalization, as proposed in [1], is a popular technique in deep learning to speed up training and reduce the difficulty of training deep neural networks. The authors of [1] hypothesize that the shifting distribution of the features makes training much harder, especially at deep layers. Deep learning methods usually work better...
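
    A minimal NumPy sketch of the forward pass discussed in the post (training mode, per-feature batch statistics; the variable names are mine):

    import numpy as np

    def batchnorm_forward(x, gamma, beta, eps=1e-5):
        """x: (N, D) mini-batch; gamma, beta: (D,) learned scale and shift."""
        mu = x.mean(axis=0)                  # per-feature batch mean
        var = x.var(axis=0)                  # per-feature batch variance
        x_hat = (x - mu) / np.sqrt(var + eps)
        out = gamma * x_hat + beta
        cache = (x_hat, gamma, var, eps)     # saved for the backward pass
        return out, cache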


Control

  • Gaussian filtering

    This notebook introduces three common Gaussian filters: the Kalman filter, the extended Kalman filter (EKF), and the unscented Kalman filter (UKF). Each filter has been implemented in Python and tested on a simple third-order system. The notebook highlights the characteristics of each filter and compares the performance of the EKF and the UKF. The theory behind them can be found on Wikipedia.

    Note that the implementation of...
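
    For reference, the linear Kalman filter that the EKF and UKF generalize can be written in a few lines of NumPy (standard notation; this is not the notebook's code):

    import numpy as np

    def kalman_step(x, P, z, F, H, Q, R):
        """One predict/update cycle of the linear Kalman filter.
        x: (n,) state estimate; P: (n, n) covariance; z: (m,) measurement;
        F: state transition; H: measurement model; Q, R: process/measurement noise."""
        # Predict
        x_pred = F @ x
        P_pred = F @ P @ F.T + Q
        # Update
        S = H @ P_pred @ H.T + R                # innovation covariance
        K = P_pred @ H.T @ np.linalg.inv(S)     # Kalman gain
        x_new = x_pred + K @ (z - H @ x_pred)
        P_new = (np.eye(len(x)) - K @ H) @ P_pred
        return x_new, P_new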


Robotics


Tooling

  • How to keep tools alive in China (incomplete)

    One trick of the GFW is DNS pollution. We need to find the correct IP address for the domain and set it explicitly in the hosts file.

    Coursera

    1. Go to https://www.ipaddress.com/
    2. Search for d3c33hcgiwev3.cloudfront.net
    3. sudo vim /etc/hosts and add the resolved addresses, e.g.:

       99.84.170.73 d3c33hcgiwev3.cloudfront.net
       99.84.170.89 d3c33hcgiwev3.cloudfront.net
       99.84.170.134 d3c33hcgiwev3.cloudfront.net
       99.84.170.230 d3c33hcgiwev3.cloudfront.net

    4. Restart networking: sudo /etc/init.d/networking restart

    ...


Culture

  • Skills of Software Developers

    The following are skills that matter for software developers beyond coding. They are biased toward the advanced-technology field and may not fully apply at internet companies.

    1. Think before talking. Talk in a coherent way.

    2. Maintain the precision and rigor of academic logic.

    3. Be willing to take responsibility when you make suggestions or complaints.

    4. Jump out of...


Deep learning

  • How to use Docker

    • Container: an isolated process on the machine.
    • Image: all the dependencies, configuration, scripts, binaries, and filesystem needed to run an app.
    • Dockerfile: a text-based script of instructions used to build an image.
    • Volumes: connect specific filesystem paths of the container back to the host.

    Start a container from an image

    docker run -d -p 80:80 docker/getting-started...
    


Book

  • Reflections on The Beauty of Mathematics (《数学之美》), Part 1

    Over the May Day holiday I read the latest edition of Wu Jun's The Beauty of Mathematics in one sitting. I remember first encountering the book in college, when a course recommended it; as a non-CS major at the time I simply could not get into it, and it ended up gathering dust. Eight years later, picking it up again, I found that the once-obscure examples have become part of my daily work, and the formulas that used to make my head spin now strike me as wonderfully elegant. Above all, the way the author moves between the humanities, mathematics, technology, and engineering left me deeply impressed by the breadth of his insight, and prompted me to rethink technology once more.

    The chapter that resonated with me most is Chapter 2, "Natural Language Processing: From Rules to Statistics." The author recounts in detail the enormous shift in natural language processing from its rule-dominated beginnings to today's deep-learning era. At the start of the field, scientists generally believed that "for a computer to perform tasks such as translation or speech recognition, it must understand natural language, and to do that it must possess human-like intelligence." So part-of-speech tagging, sentence-constituent analysis (subject, predicate, object), grammar, and many other linguistic notions were brought into natural language processing, and countless scientists (including Turing Award winners) spent decades trying to teach computers language the way humans learn it. The task was dauntingly complex: classifying nouns and verbs, handling words with multiple senses, deciding whether a sentence is grammatical, with every case translated into computer terms as rule after rule. Yet a few hard problems were never solved well by rules, and ambiguity is one of them. A word's meaning is usually fixed by its context, and rule-based methods cannot possibly cover or enumerate every context. The result is awkward: US President Bush gets translated as a shrub, and 北京大学生 (college students in Beijing) gets rendered as Peking University Students.

    Around 1970, with the work of Frederick Jelinek and the IBM Watson lab, statistics-based approaches gradually prevailed over rule-based thinking. The idea of statistical linguistics is direct: whether the task is machine translation or speech recognition, just ask how probable the sentence itself is. A statistical language model is maximum likelihood estimation: it tries to find, from the training corpus, the set of model parameters that maximizes the probability of the data. To find such a model in a finite amount of time, hidden Markov models and the EM algorithm became popular.

    In the autonomous-driving stack there is one component that may need a transformation like the one natural language processing went through: decision making. At almost every autonomous-driving company, behavioral decisions are made by rules. The most common form is a finite state machine that, given the ego vehicle's state, the states of surrounding vehicles, and map information, outputs decisions such as lane change, merge in, or merge out. Some teams simulate forward: from the current moment they build a tree whose nodes are future environment states and whose edges are actions, and use an A*-like search to pick the best action; Tesla's MCTS approach is essentially this idea. I followed this line of thinking for a while as well, but soon found its fatal weaknesses: poor generalization and rapidly growing design complexity.

    Generalization refers to whether an autonomous-driving system behaves consistently across different scenarios and different times. Some systems do well on highways yet make frequent mistakes in urban areas; others perform well on the demo route but degrade noticeably on a different road. The reason is that much of the decision logic is hard-coded if-else that cannot cover every possible driving scenario, which leaves many long-tail problems unsolved.

    The design difficulty is that the decision system accumulates more and more rules, making the whole system increasingly bloated and obscure. Many decision systems use a cost function to estimate the impact of each candidate behavior; as scenarios grow more complex, this cost function can contain dozens of terms, such as speed, lane change, collision, and road priority. Those at least have real physical meaning, but developers also tend to introduce contrived terms such as the number of lane changes or the duration of a lane change. Tuning these cost functions is painful: the better-equipped teams optimize them in simulation, but most still tune by feel.

    As The Beauty of Mathematics points out, a good system should be mathematically simple. Neither MDP-based decision models nor optimization-based ones seem to have fully captured the behavioral decision process. In industry, the prevailing mindset is still to hand human driving knowledge to the computer through code; like the "bird-flight school" described in the book, we assume that "a machine can fly only if it flies the way a bird does." Perhaps a decision system based on statistical models will yield surprising results.