Jan 01, 0001
layout: "post"
title: "Hello world to Docker Mac"
date: "2016-04-15 16:34"
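Docker for Mac has finally arrived. As hoped, the experience is very good:
- Installation is simpler: it is a standard Mac application
- It works through a VPN without trouble
- Native (osxfs) file system sharing (9p is also supported)
- The Docker application manages the xhyve VM and restarts it automatically after a configuration change
- It is fast; in day-to-day use it differs little from Docker on Linux
- It can coexist with Docker Toolbox: as on Linux, Docker for Mac listens on /var/run/docker.sock, so the client talks to its API by default; an environment variable can still point the docker CLI at the API of another Docker daemon (for example, a VM managed by docker-machine)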
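The endpoint choice described in the last item can be sketched in a few lines. This is an illustration of the lookup order, not Docker's source code: `docker_endpoint` is a hypothetical helper, `DOCKER_HOST` is the environment variable the docker CLI reads, and the TCP address in the usage line is only an example value.

```python
import os

# Default endpoint: the Unix socket that Docker for Mac listens on.
DEFAULT_SOCKET = "unix:///var/run/docker.sock"


def docker_endpoint(env=None):
    """Return the daemon endpoint a client would use (hypothetical helper).

    If DOCKER_HOST is set and non-empty it wins; otherwise fall back to
    the local socket.
    """
    env = os.environ if env is None else env
    return env.get("DOCKER_HOST") or DEFAULT_SOCKET


if __name__ == "__main__":
    print(docker_endpoint({}))
    print(docker_endpoint({"DOCKER_HOST": "tcp://192.168.99.100:2376"}))
```

Running `docker-machine env` prints export statements that set `DOCKER_HOST` (among other variables); unsetting it sends the client back to the local socket.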

...
➦ Jan 01, 0001
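In Software Engineering at Google, Google's Fergus Henderson describes the company's software engineering practices.
Software development
Source repository
- A single source repository: apart from core configuration and security-sensitive code, any engineer can read any code and change it as needed
- All development happens on the master branch; a release branch is created only at release time
- Every subtree of the code has owners, and any change needs an owner's approval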
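The ownership rule in the last item can be pictured as a lookup over directory prefixes. The sketch below is a simplified model, not Google's tooling: `owners_for`, `approved` and the `owners_by_dir` mapping are hypothetical names, and letting a subtree inherit the owners of its parent directories is an assumption made for this example.

```python
from pathlib import PurePosixPath
from typing import Dict, Iterable, Set


def owners_for(path: str, owners_by_dir: Dict[str, Set[str]]) -> Set[str]:
    """Collect owners of every directory from the file's folder up to the root.

    The repository root is keyed as "." in owners_by_dir.
    """
    owners: Set[str] = set()
    current = PurePosixPath(path).parent
    while True:
        owners |= owners_by_dir.get(str(current), set())
        if current == current.parent:  # reached "." (the root)
            return owners
        current = current.parent


def approved(path: str, approvers: Iterable[str],
             owners_by_dir: Dict[str, Set[str]]) -> bool:
    """A change to path passes if at least one approver owns the subtree."""
    return bool(owners_for(path, owners_by_dir) & set(approvers))


if __name__ == "__main__":
    owners = {".": {"root"}, "search": {"alice"}, "search/index": {"bob"}}
    print(sorted(owners_for("search/index/builder.cc", owners)))
    print(approved("ads/model.py", {"bob"}, owners))
```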
Blaze, the distributed build system
...
➦ Jan 01, 0001
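S3 outage review
On February 28, while investigating an S3 billing issue in the Northern Virginia (US-EAST-1) Region, an AWS engineer mistyped a parameter to a playbook command and removed a large part of the S3 control services by mistake, causing a 4-hour outage. The mistake affected two core S3 systems: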
...
➦ Jan 01, 0001
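GitLab outage review
On January 31, while fixing a PostgreSQL replication problem (DB Replication lagged too far behind), GitLab deleted production data by mistake (the plan was to delete the data on db2, but the command was run on db1, the wrong host). The team then tried to restore from backups, only to find that no up-to-date backup existed: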
...
➦ Jan 01, 0001
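Since Kubernetes 1.5, clusters deployed with kops or kube-up.sh automatically get a highly available setup, including:
- etcd in cluster mode
- A load-balanced apiserver
- Automatic leader election for the controller manager, scheduler and cluster autoscaler (exactly one running instance of each)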
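The "exactly one running instance" rule in the last item is commonly met with a lease-style election. The following is a minimal in-memory sketch of that idea, not the Kubernetes implementation: the `Lease` class, the `try_acquire` function and the 15-second TTL are hypothetical names and values chosen for illustration, and a real deployment would keep the lease in shared storage with atomic updates.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Lease:
    """Shared record naming the current leader and when its claim expires."""
    holder: Optional[str] = None
    expires_at: float = 0.0


def try_acquire(lease: Lease, candidate: str, now: float, ttl: float = 15.0) -> bool:
    """Return True if candidate is the leader after this call.

    A candidate wins when the lease is free, already its own (renewal),
    or expired. Otherwise the current holder keeps leadership.
    """
    if lease.holder in (None, candidate) or now >= lease.expires_at:
        lease.holder = candidate
        lease.expires_at = now + ttl
        return True
    return False


if __name__ == "__main__":
    lease = Lease()
    print(try_acquire(lease, "scheduler-a", now=0.0))   # first claim
    print(try_acquire(lease, "scheduler-b", now=5.0))   # lease still held
    print(try_acquire(lease, "scheduler-b", now=20.0))  # lease expired
```

Standby replicas keep calling `try_acquire` and stay idle while it returns False, which is what lets a standby take over once the leader stops renewing.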
The figure below shows the layout:
...
➦ Jan 01, 0001
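LinuxKit is a toolkit recently released by Docker for building secure, lean and portable operating systems for containers. From a user-written YAML file (which specifies the kernel and a series of services based on Docker images), it builds a virtual machine image for common virtualization or cloud platforms and boots it automatically. Key features:
- Stronger security
- Easy to use and extensible
- Every service can be customized, and both user services and system services are based on Docker images
- The build process is based on Docker
- Generated images can be deployed conveniently with InfraKit
Installation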
git clone https://github.com/linuxkit/linuxkit "$GOPATH/src/github.com/linuxkit/linuxkit"
cd "$GOPATH/src/github.com/linuxkit/linuxkit"  # build from inside the checkout
make && make install
How it works
Writing the YAML
LinuxKit needs a YAML file that configures the required services. The available options include
...
➦ What is the difference between Apache Mesos and Google Kubernetes
Jan 01, 0001
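Kubernetes is an open source project that brings Google's cluster management tooling to virtual machines and bare metal. It runs well on modern operating system environments (such as CoreOS and Red Hat Atomic) and provides lightweight compute nodes that you can manage yourself. Kubernetes is written in Go and is lightweight, modular, portable and extensible. We (the Kubernetes team) are working with a number of technology companies (including Mesosphere, which maintains the Mesos project) to make Kubernetes a standard way of interacting with compute clusters. Kubernetes re-implements the experience Google accumulated while building clustered applications. These concepts include: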
...
➦ awesome quick start
Jan 01, 0001
awesome is an excellent window manager for Linux: it is fast and has a clean interface. Installation is simple:
sudo apt-get install -y awesome awesome-extra gnome-settings-daemon nautilus
sudo apt-get install -y --no-install-recommends gnome-session
mkdir -p ~/.config/awesome
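Common keyboard shortcuts:
Switching programs
Switch to the next program: Mod4 + j
Switch to the previous program: Mod4 + k
Switch to the first program in the master window: Mod4 + Ctrl + Return
Switching tags
Switch to the previously selected tag: Mod4 + Esc
Switch to a specific tag: Mod4 + 1-9
Switch to the previous tag: Mod4 + Left
Switch to the next tag: Mod4 + Right
Changing window state
Maximize/unmaximize: Mod4 + m
Floating/tiled: Mod4 + Ctrl + Space
Minimize: Mod4 + n
Restore from minimized: Mod4 + Ctrl + n
Close the program: Mod4 + Shift + c
Moving and showing windows
Move to a tag: Mod4 + Shift + 1-9 (or Mod4 + left click on a tag name)
Add to further tags: Mod4 + Shift + Ctrl + 1-9
Move to the next window position: Mod4 + Shift + j
Move to the previous window position: Mod4 + Shift + k
Changing the layout
Increase the current window width by 5%: Mod4 + Shift + h
Decrease the current window width by 5%: Mod4 + Shift + l
Switch to the next layout: Mod4 + Space
Switch to the previous layout: Mod4 + Shift + Space
Window management
Restart awesome: Mod4 + Ctrl + r
Quit awesome: Mod4 + Shift + q
Run a command: Mod4 + r
Open the awesome menu: Mod4 + w
Working with multiple monitors
Switch to the next screen: Mod4 + Ctrl + j
Switch to the previous screen: Mod4 + Ctrl + k
Send the program to the next screen: Mod4 + o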
➦ bigdata
Jan 01, 0001
Awesome Big Data
A curated list of awesome big data frameworks, resources and other awesomeness. Inspired by awesome-php, awesome-python, awesome-ruby, hadoopecosystemtable & big-data.
Your contributions are always welcome!
Frameworks
- Apache Hadoop - framework for distributed processing. Integrates MapReduce (parallel processing), YARN (job scheduling) and HDFS (distributed file system).
Distributed Programming
- AddThis Hydra - distributed data processing and storage system originally developed at AddThis.
- AMPLab SIMR - run Spark on Hadoop MapReduce v1.
- Apache Crunch - a simple Java API for tasks like joining and data aggregation that are tedious to implement on plain MapReduce.
- Apache DataFu - collection of user-defined functions for Hadoop and Pig developed by LinkedIn.
- Apache Flink - high-performance runtime, and automatic program optimization.
- Apache Gora - framework for in-memory data model and persistence.
- Apache Hama - BSP (Bulk Synchronous Parallel) computing framework.
- Apache MapReduce - programming model for processing large data sets with a parallel, distributed algorithm on a cluster.
- Apache Pig - high level language to express data analysis programs for Hadoop.
- Apache S4 - framework for stream processing, implementation of S4.
- Apache Spark - framework for in-memory cluster computing.
- Apache Spark Streaming - framework for stream processing, part of Spark.
- Apache Storm - framework for stream processing by Twitter also on YARN.
- Apache Tez - application framework for executing a complex DAG (directed acyclic graph) of tasks, built on YARN.
- Apache Twill - abstraction over YARN that reduces the complexity of developing distributed applications.
- Cascalog - data processing and querying library.
- Cheetah - High Performance, Custom Data Warehouse on Top of MapReduce.
- Concurrent Cascading - framework for data management/analytics on Hadoop.
- Damballa Parkour - MapReduce library for Clojure.
- Datasalt Pangool - alternative MapReduce paradigm.
- DataTorrent StrAM - real-time engine designed to enable distributed, asynchronous, in-memory big-data computations with as little blocking, overhead and performance impact as possible.
- Facebook Corona - Hadoop enhancement which removes single point of failure.
- Facebook Peregrine - Map Reduce framework.
- Facebook Scuba - distributed in-memory datastore.
- Google Dataflow - create data pipelines to help ingest, transform and analyze data.
- Google MapReduce - map reduce framework.
- Google MillWheel - fault tolerant stream processing framework.
- JAQL - declarative programming language for working with structured, semi-structured and unstructured data.
- Kite - a set of libraries, tools, examples, and documentation focused on making it easier to build systems on top of the Hadoop ecosystem.
- Metamarkets Druid - framework for real-time analysis of large datasets.
- Netflix PigPen - map-reduce for Clojure which compiles to Apache Pig.
- Nokia Disco - MapReduce framework developed by Nokia.
- Pinterest Pinlater - asynchronous job execution system.
- Pydoop - Python MapReduce and HDFS API for Hadoop.
- Stratosphere - general purpose cluster computing framework.
- Streamdrill - useful for counting activities of event streams over different time windows and finding the most active one.
- Twitter Scalding - Scala library for Map Reduce jobs, built on Cascading.
- Twitter Summingbird - Streaming MapReduce with Scalding and Storm, by Twitter.
- Twitter TSAR - TimeSeries AggregatoR by Twitter.
Distributed Filesystem
Document Data Model
- Actian Versant - commercial object-oriented database management system.
- Crate Data - an open source, massively scalable data store. It requires zero administration.
- Facebook Apollo - Facebook’s Paxos-like NoSQL database.
- jumboDB - document oriented datastore over Hadoop.
- LinkedIn Espresso - horizontally scalable document-oriented NoSQL data store.
- MarkLogic - Schema-agnostic Enterprise NoSQL database technology.
- MongoDB - Document-oriented database system.
- RavenDB - A transactional, open-source Document Database.
- RethinkDB - document database that supports queries like table joins and group by.
Key Map Data Model
Note: There is some term confusion in the industry, and two different things are called “Columnar Databases”. Some, listed here, are distributed, persistent databases built around the “key-map” data model: all data has a (possibly composite) key, with which a map of key-value pairs is associated. In some systems, multiple such value maps can be associated with a key, and these maps are referred to as “column families” (with value map keys being referred to as “columns”).
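To make the "key-map" model in the note concrete, here is a minimal in-memory sketch: each (possibly composite) key maps to named column families, and each family is a map from column to value. `KeyMapStore` and its methods are hypothetical names for illustration only; real systems of this kind add distribution and persistence, which this toy omits.

```python
from collections import defaultdict
from typing import Any, Dict, Hashable


class KeyMapStore:
    """Toy key-map store: key -> column family -> {column: value}."""

    def __init__(self) -> None:
        # Nested maps are created on first write.
        self._rows = defaultdict(lambda: defaultdict(dict))

    def put(self, key: Hashable, family: str, column: str, value: Any) -> None:
        """Set one column inside one column family of a row."""
        self._rows[key][family][column] = value

    def get(self, key: Hashable, family: str) -> Dict[str, Any]:
        """Return a copy of one column family, or {} if it does not exist."""
        return dict(self._rows.get(key, {}).get(family, {}))


if __name__ == "__main__":
    store = KeyMapStore()
    row = ("user", 42)  # composite key
    store.put(row, "profile", "name", "Ada")
    store.put(row, "profile", "city", "London")
    store.put(row, "stats", "logins", 7)
    print(store.get(row, "profile"))
    print(store.get(row, "stats"))
```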
...
➦