
HBase Connections

Using the alihbase-client 2.8.6 jar as an example.

1. Definition of a connection

A cluster connection encapsulating lower level individual connections to actual servers and a connection to zookeeper. Connections are instantiated through the ConnectionFactory class. The lifecycle of the connection is managed by the caller, who has to close() the connection to release the resources.

The connection object contains logic to find the master, locate regions out on the cluster, keeps a cache of locations and then knows how to re-calibrate after they move. The individual connections to servers, meta cache, zookeeper connection, etc are all shared by the Table and Admin instances obtained from this connection.

Connection creation is a heavy-weight operation. Connection implementations are thread-safe, so that the client can create a connection once, and share it with different threads. Table and Admin instances, on the other hand, are light-weight and are not thread-safe. Typically, a single connection per client application is instantiated and every thread will obtain its own Table instance. Caching or pooling of Table and Admin is not recommended.

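This is the Javadoc of the Connection class. In short, creating a connection is a heavyweight operation, because the connection ties together ZooKeeper, the individual servers, the location cache, and more.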
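The usage pattern the Javadoc recommends can be sketched roughly as follows: one Connection is created once and shared, while each thread obtains and closes its own Table. This is a minimal sketch, not a definitive implementation. It assumes the standard HBase client API is on the classpath and that a cluster is reachable through the default configuration; the table name `demo_table` and row key `row-1` are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public final class SharedConnectionExample {
    public static void main(String[] args) throws IOException, InterruptedException {
        Configuration conf = HBaseConfiguration.create();

        // Heavyweight and thread-safe: create once, share, and close when done.
        try (Connection connection = ConnectionFactory.createConnection(conf)) {
            Runnable reader = () -> {
                // Lightweight and not thread-safe: one Table per thread, closed after use.
                // "demo_table" and "row-1" are hypothetical names for illustration.
                try (Table table = connection.getTable(TableName.valueOf("demo_table"))) {
                    Result result = table.get(new Get(Bytes.toBytes("row-1")));
                    System.out.println(Thread.currentThread().getName()
                            + " row found: " + !result.isEmpty());
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            };

            Thread first = new Thread(reader, "reader-1");
            Thread second = new Thread(reader, "reader-2");
            first.start();
            second.start();
            // Wait for both threads before the shared connection is closed.
            first.join();
            second.join();
        }
    }
}
```

Both threads reuse the same server connections, meta cache, and ZooKeeper connection through the shared Connection, which is why Table instances are not cached or pooled here.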


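A Few Notes on Architecture

1. Reading and writing Redis

There are two topologies, star and chain, but at their core both have one master (writes only), one hot standby (for HA), and N read-only replicas.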


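14 Common Algorithm Patterns

1. Sliding window

1.1 Here are some ways to tell that a given problem may call for a sliding window:

  1. The input is a linear data structure, such as a linked list, an array, or a string
  2. You are asked to find the longest or shortest substring or subarray, or some target value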
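
A problem that matches both signals is "find the length of the longest substring without repeated characters": the input is a string, and the answer is a longest substring. The sketch below is one common way to solve it with a sliding window. The class and method names are chosen here for illustration.

```java
import java.util.HashMap;
import java.util.Map;

final class SlidingWindow {
    /** Returns the length of the longest substring of s with no repeated character. */
    static int longestUniqueSubstring(String s) {
        // Last index at which each character was seen.
        Map<Character, Integer> lastSeen = new HashMap<>();
        int best = 0;
        int left = 0; // left edge of the current window (inclusive)

        for (int right = 0; right < s.length(); right++) {
            char c = s.charAt(right);
            Integer previous = lastSeen.get(c);
            // If c already occurs inside the window, shrink the window past it.
            if (previous != null && previous >= left) {
                left = previous + 1;
            }
            lastSeen.put(c, right);
            best = Math.max(best, right - left + 1);
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(longestUniqueSubstring("abcabcbb")); // prints 3 ("abc")
    }
}
```

The window only ever moves forward, so the string is scanned once instead of checking every substring.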

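Flink Internals: Principles and Implementation

  1. Flink uses a master/slave architecture: the JobManager is the master and the TaskManagers are the slaves.

  2. The JobManager breaks a Flink application into subtasks according to its parallelism, requests resources from the ResourceManager, and then dispatches the tasks to TaskManagers for execution. It is also responsible for fault tolerance: it tracks the job's execution state and recovers the job when it detects a failure.

    1. The JobManager consists of three parts: the JobMaster (which parses the job), the ResourceManager, and the Dispatcher
    2. The JobMaster converts the job graph generated by the client into the physical execution graph and distributes it
    3. Each JobMaster belongs to exactly one JobManager, while one JobManager can host multiple JobMasters
  3. TaskManagers manage the starting, stopping, destruction, and failure recovery of subtasks.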
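
The idea of splitting an application into subtasks by parallelism can be illustrated with a toy model. The sketch below does not use any Flink API: the class, the method, and the operator names are hypothetical, and the `name (i/n)` labels are only a convenient way to show that an operator with parallelism n yields n subtasks.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class SubtaskExpansion {
    /**
     * Expands each operator into one subtask per unit of parallelism.
     * Hypothetical model for illustration only, not Flink's scheduler.
     */
    static List<String> expand(Map<String, Integer> parallelismByOperator) {
        List<String> subtasks = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : parallelismByOperator.entrySet()) {
            int parallelism = entry.getValue();
            for (int index = 1; index <= parallelism; index++) {
                subtasks.add(entry.getKey() + " (" + index + "/" + parallelism + ")");
            }
        }
        return subtasks;
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the operators in insertion order.
        Map<String, Integer> job = new LinkedHashMap<>();
        job.put("source", 2);
        job.put("map", 3);
        // Five subtasks in total; each would be dispatched to some TaskManager.
        expand(job).forEach(System.out::println);
    }
}
```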


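Understanding Kafka in Depth

  1. All replicas of a partition are collectively called the AR (Assigned Replicas). All replicas that stay in sync with the leader replica to a certain degree (including the leader itself) form the ISR (In-Sync Replicas); the ISR set is a subset of the AR set.

    "In sync to a certain degree" reflects the fact that a follower replica pulls data from the leader replica, so during synchronization it lags behind the leader by some amount.

    The phrase refers to a tolerable range of lag, which can be tuned through configuration parameters. Replicas that lag too far behind the leader (the leader itself excluded) form the OSR (Out-of-Sync Replicas), so AR = ISR + OSR. Under normal conditions the OSR is empty and AR = ISR.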
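
The relationship AR = ISR + OSR can be made concrete with a small model. The sketch below is not Kafka code: the class is hypothetical, and it assumes lag is measured as a time in milliseconds and compared against a single threshold, in the spirit of the broker setting `replica.lag.time.max.ms`.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class ReplicaSets {
    final Set<Integer> ar = new TreeSet<>();  // all assigned replicas
    final Set<Integer> isr = new TreeSet<>(); // leader plus followers within the lag bound
    final Set<Integer> osr = new TreeSet<>(); // followers that lag too far behind

    /**
     * Hypothetical model: lagMsByReplica maps every replica id in the AR to its
     * current lag behind the leader, and maxLagMs is the tolerated lag.
     */
    ReplicaSets(int leaderId, Map<Integer, Long> lagMsByReplica, long maxLagMs) {
        for (Map.Entry<Integer, Long> entry : lagMsByReplica.entrySet()) {
            int replicaId = entry.getKey();
            ar.add(replicaId);
            // The leader is always in the ISR; a follower only if its lag is tolerable.
            boolean inSync = replicaId == leaderId || entry.getValue() <= maxLagMs;
            (inSync ? isr : osr).add(replicaId);
        }
    }

    public static void main(String[] args) {
        Map<Integer, Long> lag = new LinkedHashMap<>();
        lag.put(0, 0L);      // leader
        lag.put(1, 200L);    // slightly behind
        lag.put(2, 45_000L); // far behind
        ReplicaSets sets = new ReplicaSets(0, lag, 30_000L);
        System.out.println("AR=" + sets.ar + " ISR=" + sets.isr + " OSR=" + sets.osr);
    }
}
```

Every replica lands in exactly one of the two sets, so the ISR and OSR together always make up the AR; when no follower exceeds the threshold, the OSR is empty and the ISR equals the AR.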
