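This repository contains questions and exercises on various technical topics, sometimes related to DevOps and SRE.
There are currently 2624 exercises and questions.
You can use these to prepare for an interview, but most of the questions and exercises don't represent an actual interview. Please read the FAQ page for more details.
If you are interested in pursuing a career as a DevOps engineer, learning some of the concepts mentioned here would be useful, but you should know it's not about learning all the topics and technologies mentioned in this repository.
You can add more exercises by submitting pull requests :) Please read the contribution guidelines first.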
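DevOps | Git | Network | Hardware | Kubernetes |
Software Development | Python | Go | Perl | Regex |
Cloud | AWS | Azure | Google Cloud Platform | OpenStack |
Operating System | Linux | Virtualization | DNS | Shell Scripting |
Databases | SQL | Mongo | Testing | Big Data |
CI/CD | Certificates | Containers | OpenShift | Storage |
Terraform | Puppet | Distributed | Questions you can ask | Ansible |
Observability | Prometheus | Circle CI | Grafana |
Argo | Soft Skills | Security | System Design |
Chaos Engineering | Misc | Elastic | Kafka | NodeJs |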
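Network
In general, what do you need in order to communicate?
- A common language (for the two ends to understand)
- A way to address who you want to communicate with
- A connection (so the content of the communication can reach the recipients)
What is TCP/IP?
A set of protocols that define how two or more devices can communicate with each other.
To learn more about TCP/IP, see the further reading linked from the repository.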
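What is Ethernet?
Ethernet simply refers to the most common type of Local Area Network (LAN) used today. A LAN, in contrast to a WAN (Wide Area Network), which spans a larger geographical area, is a connected network of computers in a small area, like your office, college campus, or even home.
What is a MAC address? What is it used for?
A MAC address is a unique identification number or code used to identify individual devices on the network.
Packets that are sent on the Ethernet always come from a MAC address and are sent to a MAC address. When a network adapter receives a packet, it compares the packet's destination MAC address to the adapter's own MAC address.
When is this MAC address used?: ff:ff:ff:ff:ff:ff
When a device sends a packet to the broadcast MAC address (FF:FF:FF:FF:FF:FF), it is delivered to all stations on the local network. Ethernet broadcasts are used to resolve IP addresses to MAC addresses (by ARP) at the data link layer.
What is an IP address?
An Internet Protocol address (IP address) is a numerical label assigned to each device connected to a computer network that uses the Internet Protocol for communication. An IP address serves two main functions: host or network interface identification, and location addressing.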
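Explain the subnet mask and give an example
A subnet mask is a 32-bit number that masks an IP address and divides it into a network address and a host address. The subnet mask is made by setting the network bits to all "1"s and the host bits to all "0"s. Within a given network, out of the total usable host addresses, two are always reserved for specific purposes and cannot be allocated to any host: the first address, which is reserved as the network address (a.k.a. network ID), and the last address, which is used for network broadcast.
Example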
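As a minimal sketch of the rule above, the snippet below uses Python's standard `ipaddress` module to split an address into its network and host parts. The address `192.168.1.10/24` and the helper name `describe_subnet` are illustrative assumptions, not part of the original exercise, and the "minus two" host count only holds for prefixes of /30 and shorter.

```python
import ipaddress


def describe_subnet(cidr: str) -> dict:
    """Return the mask, network ID, broadcast address and usable host count."""
    # strict=False lets us pass a host address (192.168.1.10) instead of the network ID.
    net = ipaddress.ip_network(cidr, strict=False)
    return {
        "netmask": str(net.netmask),
        "network": str(net.network_address),      # first address: network ID
        "broadcast": str(net.broadcast_address),  # last address: broadcast
        # The two reserved addresses are not assignable to hosts.
        "usable_hosts": net.num_addresses - 2,
    }


info = describe_subnet("192.168.1.10/24")
print(info)
```

With a /24 mask (255.255.255.0), the first 24 bits identify the network and the remaining 8 bits identify the host.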
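What is a private IP address? In which scenarios/system designs should one use it?
Private IP addresses are assigned to hosts in the same network so they can communicate with one another. As the name "private" suggests, devices that have private IP addresses assigned can't be reached by devices from any external network. For example, if I am living in a hostel and I want my hostel mates to join the game server I have hosted, I will ask them to join via my server's private IP address, since the network is local to the hostel.
What is a public IP address? In which scenarios/system designs should one use it?
A public IP address is a public-facing IP address. In the event that you were hosting a game server that you want your friends to join, you would give your friends your public IP address to allow their computers to identify and locate your network and server so the connection can take place. You would not need a public-facing IP address if you were playing with friends who are connected to the same network as you; in that case, you would use a private IP address. In order for someone to be able to connect to your server that is located internally, you have to set up port forwarding to tell your router to allow traffic from the public domain into your network and vice versa.
Explain the OSI model. What layers are there? What is each layer responsible for?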
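- Application: user end (HTTP is here)
- Presentation: establishes context between application-layer entities (encryption is here)
- Session: establishes, manages and terminates the connections
- Transport: transfers variable-length data sequences from a source to a destination host (TCP and UDP are here)
- Network: transfers datagrams from one network to another (IP is here)
- Data link: provides a link between two directly connected nodes (MAC is here)
- Physical: the electrical and physical specification of the data connection (bits are here)
You can read more about the OSI model at penguintutor.com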
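For each of the following, determine to which OSI layer it belongs:
- Error correction
- Packets routing
- Cables and electrical signals
- MAC address
- IP address
- Terminate connections
- 3-way handshake
Answers:
- Error correction - Data link
- Packets routing - Network
- Cables and electrical signals - Physical
- MAC address - Data link
- IP address - Network
- Terminate connections - Session
- 3-way handshake - Transport
What delivery schemes are you familiar with?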
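Unicast: One-to-one communication where there is one sender and one receiver.
Broadcast: Sending a message to everyone in the network. The address ff:ff:ff:ff:ff:ff is used for broadcasting. Two common protocols which use broadcast are ARP and DHCP.
Multicast: Sending a message to a group of subscribers. It can be one-to-many or many-to-many.
What is CSMA/CD? Is it used in modern Ethernet networks?
CSMA/CD stands for Carrier Sense Multiple Access / Collision Detection. Its primary focus is to manage access to a shared medium/bus where only one host can transmit at a given point in time. Modern Ethernet networks are built from switches with full-duplex links, where collisions do not occur, so CSMA/CD is effectively not used there.
CSMA/CD algorithm:
- Before sending a frame, the host checks whether another host is already transmitting a frame.
- If no one is transmitting, it starts transmitting the frame.
- If two hosts transmit at the same time, we have a collision.
- Both hosts stop sending the frame and send everyone a "jam signal", notifying everyone that a collision occurred.
- Each host waits for a random time before sending again.
- Once each host has waited for its random time, it tries to send the frame again, and so the cycle starts over.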
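The "wait for a random time" step can be sketched as follows. This is an illustration only: it assumes the truncated binary exponential backoff scheme of classic Ethernet (the exponent capped at 10), and the function name `backoff_slots` is hypothetical.

```python
import random


def backoff_slots(collisions: int, rng: random.Random) -> int:
    """Pick how many slot times to wait after the n-th consecutive collision."""
    # Assumption: the waiting window doubles per collision, capped at 2**10 slots.
    exponent = min(collisions, 10)
    return rng.randint(0, 2 ** exponent - 1)


rng = random.Random(42)  # seeded only to make the demo repeatable
waits = [backoff_slots(n, rng) for n in range(1, 6)]
print(waits)  # one random wait per collision count 1..5
```

Because each host draws its wait independently from a growing window, the chance that two hosts pick the same slot again drops with every repeated collision.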
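Describe the following network devices and the difference between them:
A router, switch, and hub are all network devices used to connect devices in a local area network (LAN). However, each device operates differently and has its specific use cases. Here is a brief description of each device and the differences between them:
- Router: a network device that connects multiple network segments together. It operates at the network layer (Layer 3) of the OSI model and uses routing protocols to direct data between networks. Routers use IP addresses to identify devices and route data packets to the correct destination.
- Switch: a network device that connects multiple devices on a LAN. It operates at the data link layer (Layer 2) of the OSI model and uses MAC addresses to identify devices and direct data packets to the correct destination. Switches allow devices on the same network to communicate with each other more efficiently and can prevent data collisions that can occur when multiple devices send data simultaneously.
- Hub: a network device that connects multiple devices through a single cable and is used to connect multiple devices without segmenting a network. However, unlike a switch, it operates at the physical layer (Layer 1) of the OSI model and simply broadcasts data packets to all devices connected to it, regardless of whether the device is the intended recipient or not. This means that data collisions can occur and the network's efficiency can suffer as a result. Hubs are generally not used in modern network setups, as switches are more efficient and provide better network performance.
What is a "Collision Domain"?
A collision domain is a network segment in which devices can potentially interfere with each other by attempting to transmit data at the same time. When two devices transmit data at the same time, it can cause a collision, resulting in lost or corrupted data. In a collision domain, all devices share the same bandwidth, and any device can potentially interfere with the transmission of data by other devices.
What is a "Broadcast Domain"?
A broadcast domain is a network segment in which all devices can communicate with each other by sending broadcast messages. A broadcast message is a message that is sent to all devices in a network rather than a specific device. In a broadcast domain, all devices can receive and process broadcast messages, regardless of whether the message was intended for them or not.
Three computers are connected to a switch. How many collision domains are there? How many broadcast domains?
Three collision domains and one broadcast domain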
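How does a router work?
A router is a physical or virtual appliance that passes information between two or more packet-switched computer networks. A router inspects a given data packet's destination Internet Protocol address (IP address), calculates the best way for it to reach its destination and then forwards it accordingly.
What is NAT?
Network Address Translation (NAT) is a process in which one or more local IP addresses are translated into one or more global IP addresses and vice versa in order to provide Internet access to the local hosts.
What is a proxy? How does it work? What do we need it for?
A proxy server acts as a gateway between you and the internet. It's an intermediary server separating end users from the websites they browse.
If you're using a proxy server, internet traffic flows through the proxy server on its way to the address you requested. The response then comes back through that same proxy server (there are exceptions to this rule), and the proxy server forwards the data received from the website to you.
Proxy servers provide varying levels of functionality, security, and privacy depending on your use case, needs, or company policy.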
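What is TCP? How does it work? What is the 3-way handshake?
The TCP 3-way handshake, or three-way handshake, is a process used in a TCP/IP network to establish a connection between a server and a client.
A three-way handshake is primarily used to create a TCP socket connection. It works as follows:
- A client node sends a SYN data packet over an IP network to a server on the same or an external network. The objective of this packet is to ask/infer whether the server is open for new connections.
- The target server must have open ports that can accept and initiate new connections. When the server receives the SYN packet from the client node, it responds with a SYN/ACK packet, which both acknowledges the client's SYN and carries the server's own SYN.
- The client node receives the SYN/ACK from the server and responds with an ACK packet.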
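The handshake itself is performed by the operating system's TCP stack, so application code never builds SYN or ACK packets by hand. As a rough sketch (the loopback address, the ephemeral port and the helper `accept_one` are assumptions made for this demo), the SYN, SYN/ACK, ACK exchange described above happens inside `create_connection()` on the client side and completes before `accept()` returns on the server side:

```python
import socket
import threading


def accept_one(server: socket.socket, results: list) -> None:
    """Accept a single connection and record the peer address."""
    conn, addr = server.accept()  # returns once a handshake has completed
    with conn:
        results.append(addr)


# Server side: an open (listening) port that can accept new connections.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))     # port 0 asks the OS for any free port
server.listen(1)
port = server.getsockname()[1]

accepted = []
t = threading.Thread(target=accept_one, args=(server, accepted))
t.start()

# Client side: the kernel sends SYN, receives SYN/ACK and replies with ACK here.
client = socket.create_connection(("127.0.0.1", port), timeout=5)
client_addr = client.getsockname()
t.join(timeout=5)

client.close()
server.close()
print("server saw a connection from", accepted)
```

Running a packet capture on the loopback interface while this executes is one way to see the three packets themselves.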
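What is round-trip delay or round-trip time?
From Wikipedia: "the length of time it takes for a signal to be sent plus the length of time it takes for an acknowledgment of that signal to be received"
Bonus question: what is the RTT of a LAN?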
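How does an SSL handshake work?
The SSL handshake is a process that establishes a secure connection between a client and a server. With an RSA key exchange, it proceeds roughly as follows:
- The client sends a Client Hello message to the server, which includes the client's SSL/TLS protocol version, a list of the cryptographic algorithms supported by the client, and a random value.
- The server responds with a Server Hello message, which includes the server's SSL/TLS protocol version, a random value, and a session ID.
- The server sends a Certificate message, which contains the server's certificate.
- The server sends a Server Hello Done message, which indicates that the server is done sending messages for the Server Hello phase.
- The client sends a Client Key Exchange message, which contains the pre-master secret encrypted with the server's public key.
- The client sends a Change Cipher Spec message, which notifies the server that the client is about to send messages encrypted with the new cipher spec.
- The client sends an Encrypted Handshake Message (Finished), protected with the newly negotiated keys, so the server can verify the handshake.
- The server sends a Change Cipher Spec message, which notifies the client that the server is about to send messages encrypted with the new cipher spec.
- The server sends its own Encrypted Handshake Message (Finished), also protected with the newly negotiated keys.
- The client and server can now exchange application data.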
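What is the difference between TCP and UDP?
TCP establishes a connection between the client and the server to guarantee the order of the packets, whereas UDP does not establish a connection between the client and server and doesn't handle packet order. This makes UDP more lightweight than TCP and a good candidate for services like streaming.
Penguintutor.com provides a good explanation.
What TCP/IP protocols are you familiar with?
Explain "default gateway"
A default gateway serves as an access point or IP router that a networked computer uses to send information to a computer in another network or the internet.
What is ARP? How does it work?
ARP stands for Address Resolution Protocol. When you try to ping an IP address on your local network, say 192.168.1.1, your system has to turn the IP address 192.168.1.1 into a MAC address. This involves using ARP to resolve the address, hence its name.
Systems keep an ARP look-up table where they store information about which IP addresses are associated with which MAC addresses. When trying to send a packet to an IP address, the system will first consult this table to see if it already knows the MAC address. If there is a value cached, ARP is not used.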
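What is TTL? What does it help to prevent?
TTL (Time to Live) is a value in an IP (Internet Protocol) packet that determines how many hops or routers a packet can travel through before it is discarded. Each time a packet is forwarded by a router, the TTL value is decreased by one. When the TTL value reaches zero, the packet is dropped, and an ICMP (Internet Control Message Protocol) message is sent back to the sender indicating that the packet has expired.
- TTL is used to prevent packets from circulating indefinitely in the network, which can cause congestion and degrade network performance.
- It also helps to prevent packets from being trapped in routing loops, where packets continuously travel between the same set of routers without ever reaching their destination.
- In addition, TTL can be used to help detect and prevent IP spoofing attacks, where an attacker attempts to impersonate another device on the network by using a false or fake IP address. By limiting the number of hops that a packet can travel, TTL can help prevent packets from being routed to destinations that are not legitimate.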
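What is DHCP? How does it work?
It stands for Dynamic Host Configuration Protocol, and it allocates IP addresses, subnet masks, and gateways to hosts. This is how it works:
- A host, upon entering a network, broadcasts a message in search of a DHCP server (DHCP DISCOVER)
- The DHCP server sends back an offer as a packet containing lease time, subnet mask, IP address, etc. (DHCP OFFER)
- Depending on which offer is accepted, the client sends back a reply broadcast letting all DHCP servers know (DHCP REQUEST)
- The server sends an acknowledgment (DHCP ACK)
Read more in the further reading linked from the repository.
Can you have two DHCP servers on the same network? How does it work?
It is possible to have two DHCP servers on the same network; however, it is not recommended, and it is important to configure them carefully to prevent conflicts and configuration problems.
- When two DHCP servers are configured on the same network, there is a risk that both servers will assign IP addresses and other network configuration settings to the same device, which can cause conflicts and connectivity issues. Additionally, if the DHCP servers are configured with different network settings or options, devices on the network may receive conflicting or inconsistent configuration settings.
- However, in some cases, it may be necessary to have two DHCP servers on the same network, such as in large networks where one DHCP server may not be able to handle all the requests. In such cases, DHCP servers can be configured to serve different IP address ranges or different subnets, so they do not interfere with each other.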
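What is SSL tunneling? How does it work?
SSL (Secure Sockets Layer) tunneling is a technique used to establish a secure, encrypted connection between two endpoints over an insecure network, such as the Internet. The SSL tunnel is created by encapsulating the traffic within an SSL connection, which provides confidentiality, integrity, and authentication.
Here's how SSL tunneling works:
- A client initiates an SSL connection to a server, which involves a handshake process to establish the SSL session.
- Once the SSL session is established, the client and server negotiate encryption parameters, such as the encryption algorithm and key length, then exchange digital certificates to authenticate each other.
- The client then sends traffic through the SSL tunnel to the server, which decrypts the traffic and forwards it to its destination.
- The server sends traffic back through the SSL tunnel to the client, which decrypts the traffic and forwards it to the application.
What is a socket? Where can you see the list of sockets in your system?
A socket is a software endpoint that enables two-way communication between processes over a network. Sockets provide a standardized interface for network communication, allowing applications to send and receive data across a network. To view the list of open sockets on a Linux system, run: netstat -an
This command displays a list of all open sockets, along with their protocol, local address, foreign address, and state.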
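What is IPv6? Why should we consider using it if we have IPv4?
IPv6 (Internet Protocol version 6) is the latest version of the Internet Protocol (IP), which is used to identify and communicate with devices on a network. IPv6 addresses are 128-bit addresses and are expressed in hexadecimal notation, such as 2001:0db8:85a3:0000:0000:8a2e:0370:7334.
There are several reasons why we should consider using IPv6 over IPv4:
- Address space: IPv4 has a limited address space, which has been exhausted in many parts of the world. IPv6 provides a much larger address space (2^128 addresses compared to the 2^32 of IPv4).
- Security: IPv6 includes built-in support for IPsec, which provides end-to-end encryption and authentication for network traffic.
- Performance: IPv6 includes features that can help to improve network performance, such as multicast routing, which allows a single packet to be sent to multiple destinations simultaneously.
- Simplified network configuration: IPv6 includes features that can simplify network configuration, such as stateless autoconfiguration, which allows devices to automatically configure their own IPv6 addresses without the need for a DHCP server.
- Better mobility support: IPv6 includes features that can improve mobility support, such as Mobile IPv6, which allows devices to maintain their IPv6 addresses as they move between different networks.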
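To make the address format concrete, here is a small sketch using Python's standard `ipaddress` module on the sample address from the answer above. It only illustrates notation and address size; the variable names are arbitrary.

```python
import ipaddress

addr = ipaddress.ip_address("2001:0db8:85a3:0000:0000:8a2e:0370:7334")

print(addr.version)         # IP version detected from the notation
print(addr.max_prefixlen)   # address length in bits
print(addr.compressed)      # leading zeros dropped, one zero run collapsed to "::"
print(addr.exploded)        # full eight-group form

# Size of each address space, derived from the address length in bits.
ipv4_space = 2 ** 32
ipv6_space = 2 ** 128
print(ipv6_space // ipv4_space)  # how many IPv4 address spaces fit into IPv6
```

The compressed and exploded forms denote the same address; the short form is simply easier to read and type.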
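What is a VLAN?
A VLAN (Virtual Local Area Network) is a logical network that groups together a set of devices on a physical network, regardless of their physical location. VLANs are created by configuring network switches to assign a specific VLAN ID to frames sent by devices connected to a specific port or group of ports on the switch.
What is MTU?
MTU stands for Maximum Transmission Unit. It's the size of the largest PDU (Protocol Data Unit) that can be sent in a single transaction.
What happens if you send a packet that is bigger than the MTU?
With the IPv4 protocol, the router can fragment the PDU and then send all the fragmented PDUs through the transaction.
With the IPv6 protocol, routers do not fragment; the packet is dropped and an error (ICMPv6 "Packet Too Big") is sent back to the sender's computer.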
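The IPv4 case can be sketched with a little arithmetic. The function below is a simplified illustration, not a packet implementation: it assumes a 20-byte IPv4 header without options and relies on the rule that every fragment except the last carries a payload that is a multiple of 8 bytes. The name `ipv4_fragment_sizes` is hypothetical.

```python
def ipv4_fragment_sizes(payload_len: int, mtu: int, header_len: int = 20) -> list:
    """Return the payload size of each fragment needed to fit the given MTU."""
    # Each fragment needs its own header; offsets are counted in 8-byte units.
    max_data = (mtu - header_len) // 8 * 8
    if max_data <= 0:
        raise ValueError("MTU too small to carry any payload")
    sizes = []
    remaining = payload_len
    while remaining > 0:
        chunk = min(max_data, remaining)
        sizes.append(chunk)
        remaining -= chunk
    return sizes


# A 4000-byte datagram (20-byte header + 3980 bytes of payload) over a 1500-byte MTU.
print(ipv4_fragment_sizes(3980, 1500))
```

A payload that already fits (for example 1000 bytes with a 1500-byte MTU) comes back as a single entry, meaning no fragmentation is needed.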
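True or False? Ping uses UDP because it doesn't care about a reliable connection
False. Ping actually uses ICMP (Internet Control Message Protocol), which is a network protocol used to send diagnostic messages and control messages related to network communication.
What is SDN?
SDN stands for Software-Defined Networking. It is an approach to network management that emphasizes the centralization of network control, enabling administrators to manage network behavior through a software abstraction.
In a traditional network, network devices such as routers, switches, and firewalls are configured and managed individually, using specialized software or command-line interfaces. In contrast, SDN separates the network control plane from the data plane, allowing administrators to manage network behavior through a centralized software controller.
What is ICMP? What is it used for?
ICMP stands for Internet Control Message Protocol. It is a protocol used for diagnostic and control purposes in IP networks. It is a part of the Internet Protocol suite, operating at the network layer.
ICMP messages are used for a variety of purposes, including:
- Error reporting: ICMP messages are used to report errors that occur in the network, such as a packet that could not be delivered to its destination.
- Ping: ICMP is used to send ping messages, which are used to test whether a host or network is reachable and to measure the round-trip time for packets.
- Path MTU discovery: ICMP is used to discover the Maximum Transmission Unit (MTU) of a path, which is the largest packet size that can be transmitted without fragmentation.
- Traceroute: ICMP is used by the traceroute utility to trace the path that packets take through the network.
- Router discovery: ICMP is used to discover the routers in a network.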
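What is NAT? How does it work?
NAT stands for Network Address Translation. It's a way to map multiple local private addresses to a public one before transferring the information. Organizations that want multiple devices to share a single IP address use NAT, as do most home routers. For example, your computer's private IP could be 192.168.1.100, but your router maps the traffic to its public IP (e.g. 1.1.1.1). Any device on the internet would see the traffic coming from your public IP (1.1.1.1) instead of your private IP (192.168.1.100).
Which port number is used by each of the following protocols?:
- SSH
- SMTP
- HTTP
- DNS
- HTTPS
- FTP
- SFTP
Answers:
- SSH - 22
- SMTP - 25
- HTTP - 80
- DNS - 53
- HTTPS - 443
- FTP - 21
- SFTP - 22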
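Which factors affect network performance?
Several factors can affect network performance, including:
- Bandwidth: The available bandwidth of a network connection can significantly impact its performance. Networks with limited bandwidth can experience slow data transfer rates, high latency, and poor responsiveness.
- Latency: Latency refers to the delay that occurs when data is transmitted from one point in a network to another. High latency can result in slow network performance, especially for real-time applications like video conferencing and online gaming.
- Network congestion: When too many devices are using a network at the same time, network congestion can occur, leading to slow data transfer rates and poor network performance.
- Packet loss: Packet loss occurs when packets of data are dropped during transmission. This can result in slower network speeds and lower overall network performance.
- Network topology: The physical layout of a network, including the placement of switches, routers, and other network devices, can impact network performance.
- Network protocol: Different network protocols have different performance characteristics, which can impact network performance. For example, TCP is a reliable protocol that can guarantee the delivery of data, but it can also result in slower performance due to the overhead required for error checking and retransmission.
- Network security: Security measures such as firewalls and encryption can impact network performance, especially if they require significant processing power or introduce additional latency.
- Distance: The physical distance between devices on a network can impact network performance, especially for wireless networks where signal strength and interference can affect connectivity and data transfer rates.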
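What is APIPA?
APIPA (Automatic Private IP Addressing) is a set of IP addresses that devices are allocated when the main DHCP server is not reachable
What IP range does APIPA use?
APIPA uses the IP range: 169.254.0.1 - 169.254.255.254.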
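A quick way to check whether an address falls in that range is to test membership in 169.254.0.0/16. The sketch below uses Python's standard `ipaddress` module; the helper name `is_apipa` and the sample addresses are illustrative assumptions.

```python
import ipaddress

# The APIPA / IPv4 link-local block that contains 169.254.0.1 - 169.254.255.254.
APIPA_NETWORK = ipaddress.ip_network("169.254.0.0/16")


def is_apipa(address: str) -> bool:
    """Return True if the IPv4 address lies inside the APIPA range."""
    return ipaddress.ip_address(address) in APIPA_NETWORK


print(is_apipa("169.254.10.20"))   # self-assigned address
print(is_apipa("192.168.1.100"))   # ordinary private address
```

Seeing such an address on a host is usually a hint that the DHCP exchange failed.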
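Control Plane and Data Plane
What does "control plane" refer to?
The control plane is the part of the network that decides how to route and forward packets to a different location.
What does "data plane" refer to?
The data plane is the part of the network that actually forwards the data/packets.
What does "management plane" refer to?
It refers to monitoring and management functions.
To which plane (data, control, ...) does creating routing tables belong?
Control plane.
Explain Spanning Tree Protocol (STP).
What is link aggregation? Why is it used?
What is asymmetric routing? How do you deal with it?
What overlay (tunnel) protocols are you familiar with?
What is GRE? How does it work?
What is VXLAN? How does it work?
What is SNAT?
Explain OSPF.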
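OSPF (Open Shortest Path First) is a routing protocol that can be implemented on various types of routers. In general, OSPF is supported on most modern routers, including those from vendors such as Cisco, Juniper, and Huawei. The protocol is designed to work with IP-based networks, including both IPv4 and IPv6. It uses a hierarchical network design, where routers are grouped into areas, with each area having its own topology map and routing table. This design helps to reduce the amount of routing information that needs to be exchanged between routers and improves network scalability.
The 4 types of OSPF routers are:
- Internal Router
- Area Border Router
- Autonomous System Boundary Router
- Backbone Router
Learn more about OSPF router types: https://www.educba.com/ospf-router-types/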
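What is latency?
Latency is the time taken for information to reach its destination from the source.
What is bandwidth?
Bandwidth is the capacity of a communication channel: how much data it can handle over a specific time period. More bandwidth means the channel can handle more traffic and thus transfer more data.
What is throughput?
Throughput refers to the measurement of the actual amount of data transferred over a transmission channel in a given period of time.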
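How latency and throughput combine can be sketched with a deliberately simplified model: the total time is one one-way delay plus the time needed to push the bits through the channel. The function name and the numbers are illustrative assumptions, and real transfers (handshakes, congestion control, retransmissions) behave less neatly.

```python
def transfer_time_seconds(size_bytes: int, throughput_bps: float, latency_s: float) -> float:
    """Estimate transfer time as latency plus size divided by throughput."""
    return latency_s + (size_bytes * 8) / throughput_bps


# A tiny search query: latency dominates.
small = transfer_time_seconds(1_000, 8_000_000, 0.05)
# A 1 GB video upload: throughput dominates.
large = transfer_time_seconds(1_000_000_000, 8_000_000, 0.05)

print(round(small, 4), "seconds for the small request")
print(round(large, 2), "seconds for the large upload")
```

For the small request almost all of the time is latency, while for the large upload the latency term is negligible, which is the intuition behind the next two questions.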
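When performing a search query, what is more important, latency or throughput? And how do you ensure that when managing global infrastructure?
Latency. To have good latency, a search query should be forwarded to the closest data center.
When uploading a video, what is more important, latency or throughput? And how do you assure that?
Throughput. To have good throughput, the upload stream should be routed to an underutilized link.
What other considerations (except latency and throughput) are there when forwarding requests?
- Keeping caches updated (which means the request could be forwarded somewhere other than the closest data center)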
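Explain Spine & Leaf
"Spine & Leaf" is a networking topology commonly used in data center environments to connect multiple switches and manage network traffic efficiently. It is also known as "spine-leaf" architecture or "leaf-spine" topology. This design provides high bandwidth, low latency, and scalability, making it ideal for modern data centers handling large volumes of data and traffic. Within a Spine & Leaf network there are two main types of switches:
- Spine switches: Spine switches are high-performance switches arranged in a spine layer. These switches act as the core of the network and are typically interconnected with each leaf switch. Each spine switch is connected to all the leaf switches in the data center.
- Leaf switches: Leaf switches are connected to end devices like servers, storage arrays, and other networking equipment. Each leaf switch is connected to every spine switch in the data center. This creates non-blocking, full-mesh connectivity between leaf and spine switches, ensuring any leaf switch can communicate with any other leaf switch with maximum throughput.
The Spine & Leaf architecture has become increasingly popular in data centers due to its ability to handle the demands of modern cloud computing, virtualization, and big data applications, providing a scalable, high-performance, and reliable network infrastructure.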
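What is network congestion? What can cause it?
Network congestion occurs when there is too much data to transmit on a network and it doesn't have enough capacity to handle the demand.
This can lead to increased latency and packet loss. The causes can be multiple, such as high network usage, large file transfers, malware, hardware issues, or network design problems.
To prevent network congestion, it's important to monitor your network usage and implement strategies to limit or manage the demand.
What can you tell me about the UDP packet format? What about the TCP packet format? How are they different?
What is the exponential backoff algorithm? Where is it used?
Using Hamming code, what would be the code word for the following data word 100111010001101?
00110011110100011101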
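Give examples of protocols found in the application layer
- Hypertext Transfer Protocol (HTTP) - used for the webpages on the internet
- Simple Mail Transfer Protocol (SMTP) - email transmission
- Telecommunications Network (TELNET) - terminal emulation to allow a client access to a telnet server
- File Transfer Protocol (FTP) - facilitates the transfer of files between any two machines
- Domain Name System (DNS) - domain name translation
- Dynamic Host Configuration Protocol (DHCP) - allocates IP addresses, subnet masks, and gateways to hosts
- Simple Network Management Protocol (SNMP) - gathers data on devices on the network
Give examples of protocols found in the network layer
- Internet Protocol (IP) - assists in routing packets from one machine to another
- Internet Control Message Protocol (ICMP) - lets one know what is going on, such as error messages and debugging information
What is HSTS?
HTTP Strict Transport Security is a web server directive that informs user agents and web browsers how to handle its connection through a response header sent at the very beginning and back to the browser. This forces connections over HTTPS encryption, disregarding any script's call to load any resource in that domain over HTTP. Read more [here](https://www.globalsign.com/en/blog/what-is-hsts-and-how-do-i-use-it)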
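Network - Misc
What is the Internet? Is it the same as the World Wide Web?
The internet refers to a network of networks, transferring huge amounts of data around the globe.
The World Wide Web is an application running on millions of servers, on top of the internet, accessed through what is known as the web browser
What is an ISP?
An ISP (Internet Service Provider) is the local company that provides internet access.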
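Operating System
Operating System Exercises
Name | Topic | Objective & Instructions | Solution | Comments |
---|
Fork 101 | Fork | Link | Link | |
Fork 102 | Fork | Link | Link | |
Operating System - Self Assessment
What is an operating system?
From the book "Operating Systems: Three Easy Pieces":
"responsible for making it easy to run programs (even allowing you to seemingly run many at the same time), allowing programs to share memory, enabling programs to interact with devices, and other fun stuff like that".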
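Operating System - Process
Can you explain what a process is?
A process is a running program. A program is one or more instructions, and the program (or process) is executed by the operating system.
If you had to design an API for processes in an operating system, what would this API look like?
It would support the following:
- Create - allow creating new processes
- Delete - allow removing/destroying processes
- State - allow checking the state of a process, whether it's running, stopped, waiting, etc.
- Stop - allow stopping a running process
How is a process created?
- The OS reads the program's code and any additional relevant data
- The program's code is loaded into memory, or more specifically, into the address space of the process.
- Memory is allocated for the program's stack (aka run-time stack). The stack is also initialized by the OS with data like argv, argc and the parameters to main()
- Memory is allocated for the program's heap, which is required for dynamically allocated data like the data structures linked lists and hash tables
- I/O initialization tasks are performed; for example, in Unix/Linux based systems, each process has 3 file descriptors (input, output and error)
- The OS runs the program, starting from main()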
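From user space, these steps are triggered by asking the OS to start a program. The sketch below is only an illustration of that request, using Python's standard `subprocess` module to launch a second Python interpreter as the child; the inline child program and its arguments are made up for the demo.

```python
import subprocess
import sys

# Ask the OS to create a new process running the Python interpreter.
# The arguments after the "-c" program text end up in the child's argv.
result = subprocess.run(
    [sys.executable, "-c", "import sys; print(sys.argv[1:])", "hello", "42"],
    capture_output=True,  # collect the child's stdout/stderr file descriptors
    text=True,
    check=True,           # raise if the child exits with a non-zero status
)

print("exit status:", result.returncode)
print("child stdout:", result.stdout.strip())
```

The OS loads the interpreter's code, sets up its stack with the argument list, wires up the three standard file descriptors and only then starts executing it, which is why the child can read its arguments and write to stdout immediately.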
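True or False? The loading of the program into memory is done eagerly (all at once)
False. It was true in the past, but today's operating systems perform lazy loading, which means only the relevant pieces required for the process to run are loaded first.
What are the different states of a process?
- Running - it's executing instructions
- Ready - it's ready to run, but for different reasons it's on hold
- Blocked - it's waiting for some operation to complete, for example an I/O disk request
What are some reasons for a process to become blocked?
What is Inter-Process Communication (IPC)?
Inter-process communication (IPC) refers to the mechanisms provided by an operating system that allow processes to manage shared data.
What is "time sharing"?
Even when using a system with one physical CPU, it's possible to allow multiple users to work on it and run programs. This is possible with time sharing, where computing resources are shared in a way that makes it seem to the user as if the system has multiple CPUs, while in fact it's simply one CPU shared by applying multiprogramming and multi-tasking.
What is "space sharing"?
Somewhat the opposite of time sharing. While in time sharing a resource is used for a while by one entity and then the same resource can be used by another entity, in space sharing the space is shared by multiple entities, but in a way where it's not being transferred between them.
It's used by one entity until this entity decides to get rid of it. Take storage for example. In storage, a file is yours until you decide to delete it.
What component determines which process runs at a given moment in time?
The CPU scheduler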
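Operating System - Memory
What is "virtual memory" and what purpose does it serve?
Virtual memory combines your computer's RAM with temporary space on your hard disk. When RAM runs low, virtual memory helps to move data from RAM to a space called a paging file. Moving data to the paging file can free up the RAM, so your computer can complete its work. In general, the more RAM your computer has, the faster the programs run. https://www.minitool.com/lib/virtual-memory.html
What is demand paging?
Demand paging is a memory management technique where pages are loaded into physical memory only when accessed by a process. It optimizes memory usage by loading pages on demand, reducing startup latency and space overhead. However, it introduces some latency when accessing pages for the first time. Overall, it's a cost-effective approach for managing memory resources in operating systems.
What is copy-on-write?
Copy-on-write (COW) is a resource management concept, with the goal of reducing unnecessary copying of information. It is a concept which is implemented, for instance, within the POSIX fork syscall, which creates a duplicate process of the calling process. The idea:
- If resources are shared between two or more entities (for example shared memory segments between two processes), the resources don't need to be copied for every entity; rather, every entity has READ access to the shared resource. (The shared segments are marked as read-only.) (Think of every entity having a pointer to the location of the shared resource which can be dereferenced to read its value.)
- If one entity would perform a WRITE operation on a shared resource, a problem would arise, since the resource would also be permanently changed for ALL other entities sharing it. (Think of a process modifying some variables on the stack, or allocating some data dynamically on the heap; these changes to the shared resource would also apply to ALL other processes, which is definitely undesirable behaviour.)
- As a solution, only if a WRITE operation is about to be performed on a shared resource is this resource COPIED first, and then the changes are applied.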
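What is a kernel, and what does it do?
The kernel is part of the operating system and is responsible for tasks like:
True or False? Some pieces of the code in the kernel are loaded into protected areas of the memory so applications can't overwrite them.
True
What is POSIX?
POSIX (Portable Operating System Interface) is a set of standards that define the interface between a Unix-like operating system and application programs.
Explain what a semaphore is and what its role is in operating systems.
A semaphore is a synchronization primitive used in operating systems and concurrent programming to control access to shared resources. It's a variable or abstract data type that acts as a counter or a signaling mechanism for managing access to resources by multiple processes or threads.
What is a cache? What is a buffer?
Cache: a cache is usually used when processes are reading from and writing to the disk; it makes this faster by keeping similar data used by different programs easily accessible.
Buffer: a reserved place in RAM, which is used to hold data for temporary purposes.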
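The counter behaviour can be sketched with Python's standard `threading.Semaphore`. The limit of two, the six workers and the bookkeeping dictionary are assumptions made for the demo; the point is only that no more than `LIMIT` threads hold the semaphore at once.

```python
import threading
import time

LIMIT = 2                                   # at most two workers inside at once
slots = threading.Semaphore(LIMIT)          # counter starts at LIMIT
state = {"active": 0, "max_active": 0}
state_lock = threading.Lock()               # protects the bookkeeping counters


def worker() -> None:
    with slots:                             # acquire: blocks while the counter is 0
        with state_lock:
            state["active"] += 1
            state["max_active"] = max(state["max_active"], state["active"])
        time.sleep(0.05)                    # pretend to use the shared resource
        with state_lock:
            state["active"] -= 1
    # leaving the with-block releases the semaphore (counter + 1)


threads = [threading.Thread(target=worker) for _ in range(6)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print("peak concurrent workers:", state["max_active"])
```

A semaphore initialised to 1 behaves like a simple lock; larger initial values model a pool of identical resources.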
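Virtualization
What is virtualization?
Virtualization uses software to create an abstraction layer over computer hardware that allows the hardware elements of a single computer (processors, memory, storage and more) to be divided into multiple virtual computers, commonly called virtual machines (VMs).
What is a hypervisor?
Red Hat: "A hypervisor is software that creates and runs virtual machines (VMs). A hypervisor, sometimes called a virtual machine monitor (VMM), isolates the hypervisor operating system and resources from the virtual machines and enables the creation and management of those VMs."
Read more in the Red Hat documentation.
What types of hypervisors are there?
Hosted hypervisors and bare-metal hypervisors.
What are the advantages and disadvantages of a bare-metal hypervisor over a hosted hypervisor?
Due to having its own drivers and direct access to hardware components, a bare-metal hypervisor will often have better performance along with stability and scalability.
On the other hand, there will probably be some limitation regarding loading (any) drivers, so a hosted hypervisor will usually benefit from having better hardware compatibility.
What types of virtualization are there?
- Operating system virtualization
- Network functions virtualization
- Desktop virtualization
Is containerization a type of virtualization?
Yes, it's operating-system-level virtualization, where the kernel is shared and allows the use of multiple isolated user-space instances.
How did the introduction of virtual machines change the industry and the way applications were deployed?
The introduction of virtual machines allowed companies to deploy multiple business applications on the same hardware, while each application is separated from the others in a secure way, each running on its own separate operating system.
Virtual Machines
Do we need virtual machines in the age of containers? Are they still relevant?
Yes, virtual machines are still relevant even in the age of containers. While containers provide a lightweight and portable alternative to virtual machines, they do have certain limitations; for example, containers share the host kernel. Virtual machines still matter because they offer isolation and security, can run different operating systems, and are good for legacy applications.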
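Prometheus
What is Prometheus? What are some of Prometheus's main features?
Prometheus is a popular open-source systems monitoring and alerting toolkit, originally developed at SoundCloud. It is designed to collect and store time-series data, and to allow for querying and analysis of that data using a powerful query language called PromQL. Prometheus is frequently used to monitor cloud-native applications, microservices, and other modern infrastructure.
Some of the main features of Prometheus include: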
1. Data model: Prometheus uses a flexible data model that allows users to organize and label their time-series data in a way that makes sense for their particular use case. Labels are used to identify different dimensions of the data, such as the source of the data or the environment in which it was collected.
2. Pull-based architecture: Prometheus uses a pull-based model to collect data from targets, meaning that the Prometheus server actively queries its targets for metrics data at regular intervals. This architecture is more scalable and reliable than a push-based model, which would require every target to push data to the server.
3. Time-series database: Prometheus stores all of its data in a time-series database, which allows users to perform queries over time ranges and to aggregate and analyze their data in various ways. The database is optimized for write-heavy workloads, and can handle a high volume of data with low latency.
4. Alerting: Prometheus includes a powerful alerting system that allows users to define rules based on their metrics data and to send alerts when certain conditions are met. Alerts can be sent via email, chat, or other channels, and can be customized to include specific details about the problem.
5. Visualization: Prometheus ships with a built-in expression browser for running ad-hoc queries and drawing simple graphs. For custom dashboards it is commonly paired with Grafana; the older PromDash dashboard builder has been deprecated.
Overall, Prometheus is a powerful and flexible tool for monitoring and analyzing systems and applications, and is widely used in the industry for cloud-native monitoring and observability.
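In what scenarios might it be better NOT to use Prometheus?
From the Prometheus documentation: "if you need 100% accuracy, such as for per-request billing".
Describe Prometheus architecture and components
The Prometheus architecture consists of four major components: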
1. Prometheus Server: The Prometheus server is responsible for collecting and storing metrics data. It has a simple built-in storage layer that allows it to store time-series data in a time-ordered database.
2. Client Libraries: Prometheus provides a range of client libraries that enable applications to expose their metrics data in a format that can be ingested by the Prometheus server. These libraries are available for a range of programming languages, including Java, Python, and Go.
3. Exporters: Exporters are software components that expose existing metrics from third-party systems and make them available for ingestion by the Prometheus server. Prometheus provides exporters for a range of popular technologies, including MySQL, PostgreSQL, and Apache.
4. Alertmanager: The Alertmanager component is responsible for processing alerts generated by the Prometheus server. It can handle alerts from multiple sources and provides a range of features for deduplicating, grouping, and routing alerts to appropriate channels.
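Overall, the Prometheus architecture is designed to be highly scalable and resilient. The server and client libraries can be deployed in a distributed fashion to support monitoring across large-scale, highly dynamic environments.
Can you compare Prometheus to other solutions like InfluxDB, for example?
Compared to other monitoring solutions, such as InfluxDB, Prometheus is known for its high performance and scalability. It can handle large volumes of data and can easily be integrated with other tools in the monitoring ecosystem. InfluxDB, on the other hand, is known for its ease of use and simplicity. It has a user-friendly interface and provides easy-to-use APIs for collecting and querying data.
Another popular solution, Nagios, is a more traditional monitoring system that relies on a push-based model for collecting data. Nagios has been around for a long time and is known for its stability and reliability. However, compared to Prometheus, Nagios lacks some of the more advanced features, such as a multi-dimensional data model and a powerful query language.
Overall, the choice of a monitoring solution depends on the specific needs and requirements of the organization. While Prometheus is a great choice for large-scale monitoring and alerting, InfluxDB may be a better fit for smaller environments that require ease of use and simplicity. Nagios remains a solid choice for organizations that prioritize stability and reliability over advanced features.
What is an Alert?
In Prometheus, an alert is a notification triggered when a specific condition or threshold is met. Alerts can be configured to trigger when certain metrics cross a certain threshold or when specific events occur. Once an alert is triggered, it can be routed to various channels, such as email, pager, or chat, to notify relevant teams or individuals to take appropriate action. Alerts are a critical component of any monitoring system, as they allow teams to proactively detect and respond to issues before they impact users or cause system downtime.
What is an Instance? What is a Job?
In Prometheus, an instance refers to a single target that is being monitored, for example a single server or service. A job is a set of instances that perform the same function, such as a set of web servers serving the same application. Jobs allow you to define and manage a group of targets together.
In essence, an instance is an individual target that Prometheus collects metrics from, while a job is a collection of similar instances that can be managed as a group.
What core metrics types does Prometheus support?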
Prometheus supports several types of metrics, including:
1. Counter: A monotonically increasing value used for tracking counts of events or samples. Examples include the number of requests processed or the total number of errors encountered.
2. Gauge: A value that can go up or down, such as CPU usage or memory usage. Unlike counters, gauge values can be arbitrary, meaning they can go up and down based on changes in the system being monitored.
3. Histogram: A set of observations or events that are divided into buckets based on their value. Histograms help in analyzing the distribution of a metric, such as request latencies or response sizes.
4. Summary: A summary is similar to a histogram, but instead of buckets, it provides a set of quantiles for the observed values. Summaries are useful for monitoring the distribution of request latencies or response sizes over time.
Prometheus also supports various functions and operators for aggregating and manipulating metrics, such as sum, max, min, and rate. These features make it a powerful tool for monitoring and alerting on system metrics.
What is an exporter? What is it used for?
An exporter serves as a bridge between a third-party system or application and Prometheus, making it possible for Prometheus to monitor and collect data from that system or application. The exporter acts as a server, listening on a specific network port for requests from Prometheus to scrape metrics. It collects metrics from the third-party system or application and transforms them into a format that can be understood by Prometheus. The exporter then exposes these metrics to Prometheus via an HTTP endpoint, making them available for collection and analysis.
Exporters are commonly used to monitor various types of infrastructure components, such as databases, web servers, and storage systems. For example, there are exporters available for monitoring popular databases such as MySQL and PostgreSQL, as well as web servers like Apache and Nginx.
Overall, exporters are a critical component of the Prometheus ecosystem, allowing for the monitoring of a wide range of systems and applications, and providing a high degree of flexibility and extensibility to the platform.
Which Prometheus best practices are you familiar with?
Here are three of them:
1. Label carefully: Careful and consistent labeling of metrics is crucial for effective querying and alerting. Labels should be clear, concise, and include all relevant information about the metric.
2. Keep metrics simple: The metrics exposed by exporters should be simple and focus on a single aspect of the system being monitored. This helps avoid confusion and ensures that the metrics are easily understandable by all members of the team.
3. Use alerting sparingly: While alerting is a powerful feature of Prometheus, it should be used sparingly and only for the most critical issues. Setting up too many alerts can lead to alert fatigue and result in important alerts being ignored. It is recommended to set up only the most important alerts and adjust the thresholds over time based on the actual frequency of alerts.
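How do you get total requests in a given period of time?
To get the total requests in a given period of time using Prometheus, you can use the sum function together with the increase function. Here is an example query that gives you the total number of requests in the last hour: sum(increase(http_requests_total[1h]))
In this query, http_requests_total is the name of the metric that tracks the total number of HTTP requests, and the increase function calculates how much the counter grew over the last hour for each time series. The sum function then adds up the series to give you the total for the last hour. Note that rate(http_requests_total[1h]) would instead return the average per-second rate of requests, not a count, and that increase extrapolates, so the result may not be an exact integer.
You can adjust the time range by changing the duration in the range selector. For example, to get the total number of requests in the last day, change it to increase(http_requests_total[1d]).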
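What does HA in Prometheus mean?
HA stands for High Availability. This means that the system is designed to be highly reliable and always available, even in the face of failures or other issues. In practice, this typically involves setting up multiple instances of Prometheus and ensuring that they are all synchronized and able to work together seamlessly. This can be achieved through a variety of techniques, such as load balancing, replication, and failover mechanisms. By implementing HA in Prometheus, users can ensure that their monitoring data is always available and up to date, even in the face of hardware or software failures, network issues, or other problems that might otherwise cause downtime or data loss.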
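How do you join two metrics?
PromQL has no join() function. Two metrics are combined with a binary operator, and the on() (or ignoring()) modifier specifies which labels must match between the two sides. For many-to-one matches, group_left or group_right can be added. Here is an example that combines two metrics on their service and instance labels:
    error_count_total
      / on(service, instance)
    request_count_total
In this example, each error_count_total time series is matched with the request_count_total time series that has the same service and instance label values, and the division yields the error ratio per service and instance.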
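How do you write a query that returns the value of a label?
In Grafana, with a Prometheus data source, you can use the label_values function in a template variable query. It takes the name of the metric and the name of the label. For example, if you have a metric called http_requests_total with a label called method, and you want to return all the values of the method label, you can use the following query:
    label_values(http_requests_total, method)
This returns a list of all the values of the method label in the http_requests_total metric, which you can then use in other queries or to filter your data. Note that label_values is not part of PromQL itself; in Prometheus directly you can call the HTTP API endpoint /api/v1/label/method/values, or use an aggregation such as count by (method) (http_requests_total).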
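How do you convert cpu_user_seconds to CPU usage in percentage?
The counter holds CPU seconds consumed, and rate() already divides the increase by the elapsed time, so you only need to divide the result by the number of CPU cores and multiply by 100:
    100 * sum(rate(process_cpu_user_seconds_total{job="<job-name>"}[<time-period>])) by (instance) / <num-cpu-cores>
Here, <job-name> is the name of the job you want to query, <time-period> is the time range you want to query (e.g. 5m, 1h), and <num-cpu-cores> is the number of CPU cores on the machine you are querying.
For example, to get the CPU usage in percentage over the last 5 minutes for a job named my-job running on a machine with 4 CPU cores, you can use the following query:
    100 * sum(rate(process_cpu_user_seconds_total{job="my-job"}[5m])) by (instance) / 4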
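Go
What are some characteristics of the Go programming language?
- Strong and static typing - the type of a variable can't be changed over time, and types have to be defined at compile time
- Simplicity
- Fast compile times
- Built-in concurrency
- Garbage collected
- Platform independent
- Compiles to a standalone binary - anything you need to run your app will be compiled into one binary. Very useful for version management at run time.
Go also has a good community.
What is the difference between var x int = 2 and x := 2?
The result is the same: a variable with the value 2.
With var x int = 2 we are setting the variable type to integer, while with x := 2 we are letting Go figure out the type by itself.
True or False? In Go we can redeclare variables, and once declared we must use them.
False. We can't redeclare variables, but yes, we must use declared variables.
What libraries of Go have you worked with?
This should be answered based on your usage, but some examples are: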
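What is the problem with the following block of code? How do you fix it?
    func main() {
        var x float32 = 13.5
        var y int
        y = x
    }
The following block of code tries to convert the integer 101 to a string, but instead we get "e". Why is that? How do you fix it?
    package main

    import "fmt"

    func main() {
        var x int = 101
        var y string
        y = string(x)
        fmt.Println(y)
    }
It looks up which Unicode character has the code point 101 and uses that as the result of converting the integer to a string. To get "101", use the strconv package and replace y = string(x) with y = strconv.Itoa(x).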
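What is wrong with the following code?:
    package main

    func main() {
        var x = 2
        var y = 3
        const someConst = x + y
    }
Constants in Go can only be declared using constant expressions. But x, y and their sum are variables, so the compiler reports:
    const initializer x + y is not a constant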
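What will be the output of the following block of code?
    package main

    import "fmt"

    const (
        x = iota
        y = iota
    )
    const z = iota

    func main() {
        fmt.Printf("%v\n", x)
        fmt.Printf("%v\n", y)
        fmt.Printf("%v\n", z)
    }
Go's iota identifier is used in const declarations to simplify definitions of incrementing numbers. Because it can be used in expressions, it provides a generality beyond that of simple enumerations.
x and y are in the first iota group and z is in the second, so the output is 0, 1 and 0.
See the Iota page in the Go Wiki.
What is _ used for in Go?
It avoids having to declare variables for all the return values. It is called the blank identifier.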
What will be the output of the following block of code?
    package main

    import "fmt"

    const (
        _ = iota + 3
        x
    )

    func main() {
        fmt.Printf("%v\n", x)
    }
Since the first iota is declared with the value 3 (+ 3), the next one has the value 4.
What will be the output of the following block of code?
    package main

    import (
        "fmt"
        "sync"
        "time"
    )

    func main() {
        var wg sync.WaitGroup
        wg.Add(1)
        go func() {
            time.Sleep(time.Second * 2)
            fmt.Println("1")
            wg.Done()
        }()
        go func() {
            fmt.Println("2")
        }()
        wg.Wait()
        fmt.Println("3")
    }
Output: 2 1 3
Further reading: an article about sync/WaitGroup and the Golang sync package documentation.
What will be the output of the following block of code?
    package main

    import (
        "fmt"
    )

    func mod1(a []int) {
        for i := range a {
            a[i] = 5
        }
        fmt.Println("1:", a)
    }

    func mod2(a []int) {
        a = append(a, 125) // !
        for i := range a {
            a[i] = 5
        }
        fmt.Println("2:", a)
    }

    func main() {
        s1 := []int{1, 2, 3, 4}
        mod1(s1)
        fmt.Println("1:", s1)
        s2 := []int{1, 2, 3, 4}
        mod2(s2)
        fmt.Println("2:", s2)
    }
Output:
    1: [5 5 5 5]
    1: [5 5 5 5]
    2: [5 5 5 5 5]
    2: [1 2 3 4]
In mod1, a refers to the same underlying array as s1, so when we assign to a[i] we change the values of s1 as well. But in mod2, append creates a new slice, so we are changing only the values of a, not s2.
Further reading: an article about arrays and a blog post about append.
What will be the output of the following block of code?
    package main

    import (
        "container/heap"
        "fmt"
    )

    // An IntHeap is a min-heap of ints.
    type IntHeap []int

    func (h IntHeap) Len() int           { return len(h) }
    func (h IntHeap) Less(i, j int) bool { return h[i] < h[j] }
    func (h IntHeap) Swap(i, j int)      { h[i], h[j] = h[j], h[i] }

    func (h *IntHeap) Push(x interface{}) {
        // Push and Pop use pointer receivers because they modify the slice's length,
        // not just its contents.
        *h = append(*h, x.(int))
    }

    func (h *IntHeap) Pop() interface{} {
        old := *h
        n := len(old)
        x := old[n-1]
        *h = old[0 : n-1]
        return x
    }

    func main() {
        h := &IntHeap{4, 8, 3, 6}
        heap.Init(h)
        heap.Push(h, 7)
        fmt.Println((*h)[0])
    }
Output: 3
Further reading: the Golang container/heap package documentation.
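Mongo
What are the advantages of MongoDB? Or in other words, why choose MongoDB and not another implementation of NoSQL?
MongoDB advantages are as follows:
What is the difference between SQL and NoSQL?
The main difference is that SQL databases are structured (data is stored in the form of tables with rows and columns, like an Excel spreadsheet table), while NoSQL is unstructured, and the data storage can vary depending on how the NoSQL DB is set up, such as key-value pairs, document-oriented, etc.
In what scenarios would you prefer to use NoSQL/Mongo over SQL?
- Heterogeneous data which changes often
- Data consistency and integrity is not the top priority
- Best if the database needs to scale rapidly
What is a document? What is a collection?
- A document is a record in MongoDB, which is stored in BSON (Binary JSON) format and is the basic unit of data in MongoDB.
- A collection is a group of related documents stored in a single database in MongoDB.
What is an aggregator?
- An aggregator is a framework in MongoDB that performs operations on a set of data to return a single computed result.
What is better? Embedded documents or referenced?
- There is no definitive answer to which is better; it depends on the specific use case and requirements. Some explanations: embedded documents provide atomic updates, while referenced documents allow for better normalization.
Have you performed data retrieval optimizations in Mongo? If not, can you think about ways to optimize a slow data retrieval?
- Some ways to optimize data retrieval in MongoDB are: indexing, proper schema design, query optimization and database load balancing.
Queries
Explain this query: db.books.find({"name": /abc/})
Explain this query: db.books.find().sort({x:1})
What is the difference between find() and find_one()?
- find() returns all documents that match the query conditions.
- find_one() returns only one document that matches the query conditions (or null if no match is found).
How can you export data from Mongo DB?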
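SQL
SQL Exercises
Name | Topic | Objective & Instructions | Solution | Comments |
---|
Functions vs. Comparisons | Query Improvements | Exercise | Solution | |
SQL Self Assessment
What is SQL?
SQL (Structured Query Language) is a standard language for relational databases (like MySQL, MariaDB, ...).
It's used for reading, updating, removing and creating data in a relational database.
How is SQL different from NoSQL?
The main difference is that SQL databases are structured (data is stored in the form of tables with rows and columns, like an Excel spreadsheet table), while NoSQL is unstructured, and the data storage can vary depending on how the NoSQL DB is set up, such as key-value pairs, document-oriented, etc.
When is it best to use SQL? NoSQL?
SQL - Best used when data integrity is crucial. SQL is typically implemented with many businesses and areas within the finance field due to its ACID compliance.
NoSQL - Great if you need to scale things quickly. NoSQL was designed with web applications in mind, so it works well if you need to quickly spread the same information around to multiple servers.
Additionally, since NoSQL does not adhere to the strict table with columns and rows structure that relational databases require, you can store different data types together.
Practical SQL - Basics
For these questions, we will be using the Customers and Orders tables shown below:
Customers
customer_id | customer_name | items_in_cart | cash_spent_to_date |
---|
100204 | John Smith | 0 | 20.00 |
100205 | Jane Smith | 3 | 40.00 |
100206 | Bobby Frank | 1 | 100.20 |
Orders
customer_id | order_id | item | price | date_sold |
---|
100206 | A123 | Rubber Ducky | 2.20 | 2019-09-18 |
100206 | A123 | Bubble Bath | 8.00 | 2019-09-18 |
100206 | Q987 | 80-Pack TP | 90.00 | 2019-09-20 |
100205 | Z001 | Cat Food - Tuna Fish | 10.00 | 2019-08-05 |
100205 | Z001 | Cat Food - Chicken | 10.00 | 2019-08-05 |
100205 | Z001 | Cat Food - Beef | 10.00 | 2019-08-05 |
100205 | Z001 | Cat Food - Kitty quesadilla | 10.00 | 2019-08-05 |
100204 | X202 | Coffee | 20.00 | 2019-04-29 |
How would I select all fields from this table?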
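    SELECT *
    FROM customers;
How many items are in John's cart?
    SELECT items_in_cart
    FROM customers
    WHERE customer_name = 'John Smith';
What is the sum of all the cash spent across all customers?
    SELECT SUM(cash_spent_to_date) AS sum_cash
    FROM customers;
How many people have items in their cart?
    SELECT COUNT(1) AS number_of_people_w_items
    FROM customers
    WHERE items_in_cart > 0;
How would you join the customers table to the orders table?
You would join them on the unique key. In this case, the unique key is customer_id in both the customers table and the orders table
How would you show which customer ordered which items?
    SELECT c.customer_name, o.item
    FROM customers c
    LEFT JOIN orders o
    ON c.customer_id = o.customer_id;
Using a WITH statement, how would you show who ordered cat food, and the total amount of money spent?
    WITH cat_food AS (
        SELECT customer_id, SUM(price) AS total_price
        FROM orders
        WHERE item LIKE '%Cat Food%'
        GROUP BY customer_id
    )
    SELECT customer_name, total_price
    FROM customers c
    INNER JOIN cat_food f
    ON c.customer_id = f.customer_id
    WHERE c.customer_id IN (SELECT customer_id FROM cat_food);
Although this was a simple statement, the WITH clause really shines when a complex query needs to be run on a table before joining to another. WITH statements are nice because you create a pseudo temp table when running your query, instead of creating a whole new table.
The sum of all the purchases of cat food wasn't readily available, so we used a WITH statement to create the pseudo table to retrieve the sum of the prices spent by each customer, then joined the tables normally.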
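To try the WITH query without installing a database server, the sketch below loads the two sample tables into an in-memory SQLite database through Python's standard `sqlite3` module. The lowercase column names and the use of SQLite are assumptions made for this demo; other engines may differ in details such as case sensitivity of LIKE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER, customer_name TEXT,
                        items_in_cart INTEGER, cash_spent_to_date REAL);
CREATE TABLE orders (customer_id INTEGER, order_id TEXT, item TEXT,
                     price REAL, date_sold TEXT);
INSERT INTO customers VALUES
  (100204, 'John Smith', 0, 20.00),
  (100205, 'Jane Smith', 3, 40.00),
  (100206, 'Bobby Frank', 1, 100.20);
INSERT INTO orders VALUES
  (100206, 'A123', 'Rubber Ducky', 2.20, '2019-09-18'),
  (100206, 'A123', 'Bubble Bath', 8.00, '2019-09-18'),
  (100206, 'Q987', '80-Pack TP', 90.00, '2019-09-20'),
  (100205, 'Z001', 'Cat Food - Tuna Fish', 10.00, '2019-08-05'),
  (100205, 'Z001', 'Cat Food - Chicken', 10.00, '2019-08-05'),
  (100205, 'Z001', 'Cat Food - Beef', 10.00, '2019-08-05'),
  (100205, 'Z001', 'Cat Food - Kitty quesadilla', 10.00, '2019-08-05'),
  (100204, 'X202', 'Coffee', 20.00, '2019-04-29');
""")

# The CTE aggregates cat food purchases per customer before the join.
rows = conn.execute("""
WITH cat_food AS (
    SELECT customer_id, SUM(price) AS total_price
    FROM orders
    WHERE item LIKE '%Cat Food%'
    GROUP BY customer_id
)
SELECT c.customer_name, f.total_price
FROM customers c
INNER JOIN cat_food f ON c.customer_id = f.customer_id
""").fetchall()

people_with_items = conn.execute(
    "SELECT COUNT(1) FROM customers WHERE items_in_cart > 0"
).fetchone()[0]

print(rows)
print(people_with_items)
```

The inner join already restricts the result to customers present in cat_food, so the extra IN filter from the original query is left out here.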
Which of the following queries would you use?
    SELECT count(*)
    FROM shawarma_purchases
    WHERE YEAR(purchased_at) == '2017'
vs.
    SELECT count(*)
    FROM shawarma_purchases
    WHERE purchased_at >= '2017-01-01' AND purchased_at <= '2017-12-31'
The second one:
    SELECT count(*) FROM shawarma_purchases WHERE purchased_at >= '2017-01-01' AND purchased_at <= '2017-12-31'
When you use a function (YEAR(purchased_at)), the database has to scan the whole table, as opposed to using indexes on the column as-is, in its natural state.
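OpenStack
What components/projects of OpenStack are you familiar with?
Can you tell me what each of the following services/projects is responsible for?
- Nova - Manage virtual instances
- Neutron - Manage networking by providing Network as a Service (NaaS)
- Cinder - Block Storage
- Glance - Manage images for virtual machines and containers (search, get and register)
- Keystone - Authentication service across the cloud
Identify the service/project used for each of the following:
- Copy or snapshot instances
- GUI for viewing and modifying resources
- Block Storage
- Manage virtual instances
Answers:
- Glance - Images service. Also used for copying or snapshotting instances
- Horizon - GUI for viewing and modifying resources
- Cinder - Block Storage
- Nova - Manage virtual instances
What is a tenant/project?
Determine true or false:
- OpenStack is free to use
- The service responsible for networking is Glance
- The purpose of tenant/project is to share resources between different projects and users of OpenStack
Describe in detail how you bring up an instance with a floating IP
You get a call from a customer saying: "I can ping my instance but can't connect (ssh) to it". What might be the problem?
What types of networks does OpenStack support?
How do you debug OpenStack storage issues? (tools, logs, ...)
How do you debug OpenStack compute issues? (tools, logs, ...)
OpenStack Deployment & TripleO
Have you deployed OpenStack in the past? If yes, can you describe how you did it?
Are you familiar with TripleO? How is it different from Devstack or Packstack?
You can read about TripleO in its documentation.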
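OpenStack Compute
Can you describe Nova in detail?
- Used to provision and manage virtual instances
- It supports multi-tenancy at different levels - logging, end-user control, auditing, etc.
- Highly scalable
- Authentication can be done using an internal system or LDAP
- Supports multiple types of block storage
- Tries to be hardware and hypervisor agnostic
What do you know about Nova architecture and components?
- nova-api - the server which serves metadata and compute APIs
- The different Nova components communicate by using a queue (usually RabbitMQ) and a database
- A request for creating an instance is inspected by nova-scheduler, which determines where the instance will be created and run
- nova-compute is the component responsible for communicating with the hypervisor for creating the instance and managing its lifecycle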
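OpenStack Networking (Neutron)
Explain Neutron in detail
- One of the core components of OpenStack and a standalone project
- Neutron is focused on delivering networking as a service
- With Neutron, users can set up networks in the cloud and configure and manage a variety of network services
- Neutron interacts with:
- Nova - Nova communicates with Neutron to plug NICs into a network
- Horizon - supports networking entities in the dashboard and also provides a topology view which includes networking details
Explain each of the following components:
- neutron-dhcp-agent
- neutron-l3-agent
- neutron-metering-agent
- neutron-*-agent
- neutron-server
Answers:
- neutron-l3-agent - L3/NAT forwarding (provides external network access for VMs, for example)
- neutron-dhcp-agent - DHCP services
- neutron-metering-agent - L3 traffic metering
- neutron-*-agent - manages local vSwitch configuration on each compute node (based on the chosen plugin)
- neutron-server - exposes the networking API and passes requests to other plugins if required
Explain these network types:
- Management Network - used for internal communication between OpenStack components. Any IP address in this network is accessible only within the datacenter
- Guest Network - used for communication between instances/VMs
- API Network - used for services API communication. Any IP address in this network is publicly accessible
- External Network - used for public communication. Any IP address in this network is accessible by anyone on the internet
In which order should you remove the following entities:
There are many reasons for that. One for example: you can't remove a router if there are active ports assigned to it.
What is a provider network?
What components and services exist for L2 and L3?
What is the ML2 plug-in? Explain its architecture
What is the L2 agent? How does it work and what is it responsible for?
What is the L3 agent? How does it work and what is it responsible for?
Explain what the Metadata agent is responsible for
What networking entities does Neutron support?
How do you debug OpenStack networking issues? (tools, logs, ...)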
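OpenStack - Glance
Explain Glance in detail
- Glance is the OpenStack image service
- It handles requests related to instance disks and images
- Glance is also used for creating snapshots for quick instance backups
- Users can use Glance to create new images or upload existing ones
Describe Glance architecture
- glance-api - responsible for handling image API calls such as retrieval and storage. It consists of two APIs: 1. registry-api - responsible for internal requests 2. user API - can be accessed publicly
- glance-registry - responsible for handling image metadata requests (e.g. size, type, etc.). This component is private, which means it's not available publicly
- metadata definition service - API for custom metadata
- database - for storing image metadata
- image repository - for storing images. This can be a filesystem, Swift object storage, HTTP, etc.
OpenStack - Swift
Explain Swift in detail
- Swift is the Object Store service: a highly available, distributed and eventually consistent store designed for storing a lot of data
- Swift distributes data across multiple servers while writing it to multiple disks
- One can choose to add additional servers to scale the cluster, all while Swift maintains the integrity of the information and the data replications.
Can users store by default an object of 100GB in size?
Not by default. The Object Storage API limits the maximum to 5GB per object, but it can be adjusted.
Explain the following in regards to Swift: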
- Container - Defines a namespace for objects.
- Account - Defines a namespace for containers
- Object - Data content (e.g. image, document, ...)
True or False? There can be two objects with the same name in the same container but not in two different containers
False. Two objects can have the same name if they are in different containers.
OpenStack - Cinder
Explain Cinder in detail
- Cinder is the OpenStack Block Storage service
- It basically provides users with storage resources they can consume with other services such as Nova
- One of the most used implementations of storage supported by Cinder is LVM
- From the user's perspective this is transparent, which means the user doesn't know where, behind the scenes, the storage is located or what type of storage is used
Describe Cinder's components
- cinder-api - receives API requests
- cinder-volume - manages attached block devices
- cinder-scheduler - decides on which storage backend a volume is placed
OpenStack - Keystone
Can you describe the following concepts in regards to Keystone?
- Role - A list of rights and privileges determining what a user or a project can perform
- Tenant/Project - Logical representation of a group of resources isolated from other groups of resources. It can be an account, organization, ...
- Service - An endpoint which the user can use for accessing different resources
- Endpoint - A network address which can be used to access a certain OpenStack service
- Token - Used for accessing resources while describing which resources can be accessed by using a scope
What are the properties of a service? In other words, how is a service identified?
Using:
Explain the following: - PublicURL - InternalURL - AdminURL
- PublicURL - Publicly accessible through the public internet
- InternalURL - Used for communication between services
- AdminURL - Used for administrative management
What is a service catalog?
A list of services and their endpoints
OpenStack Advanced - Services
Describe each of the following services
- Swift - highly available, distributed, eventually consistent object/blob store
- Sahara - Manage Hadoop Clusters
- Ironic - Bare Metal Provisioning
- Trove - Database as a service that runs on OpenStack
- Aodh - Alarms Service
- Ceilometer - Track and monitor usage
Identify the service/project used for each of the following:
- Database as a service which runs on OpenStack
- Bare Metal Provisioning
- Track and monitor usage
- Alarms Service
- Manage Hadoop Clusters
- highly available, distributed, eventually consistent object/blob store
Answers:
- Database as a service which runs on OpenStack - Trove
- Bare Metal Provisioning - Ironic
- Track and monitor usage - Ceilometer
- Alarms Service - Aodh
- Manage Hadoop Clusters - Sahara
- highly available, distributed, eventually consistent object/blob store - Swift
OpenStack Advanced - Keystone
Can you describe Keystone service in detail?
- You can't have OpenStack deployed without Keystone
- It provides identity, policy and token services
- The authentication provided is for both users and services
- The authorization supported is token-based and user-based.
- There is a policy defined based on RBAC stored in a JSON file and each line in that file defines the level of access to apply
Describe Keystone architecture
- There is a service API and admin API through which Keystone gets requests
- Keystone has four backends:
  - Token Backend - Temporary Tokens for users and services
  - Policy Backend - Rules management and authorization
  - Identity Backend - users and groups (either standalone DB, LDAP, ...)
  - Catalog Backend - Endpoints
- It has a pluggable environment where you can integrate with:
  - KVS (Key Value Store)
  - SQL
  - PAM
  - Memcached
Describe the Keystone authentication process
- Keystone gets a call/request and checks whether it's from an authorized user, using username, password and authURL
- Once confirmed, Keystone provides a token.
- A token contains a list of the user's projects, so there is no need to authenticate every time and a token can be submitted instead
OpenStack Advanced - Compute (Nova)
What does each of the following do?
- nova-api
- nova-compute
- nova-conductor
- nova-cert
- nova-consoleauth
- nova-scheduler
- nova-api - responsible for managing requests/calls
- nova-compute - responsible for managing instance lifecycle
- nova-conductor - Mediates between nova-compute and the database so nova-compute doesn't access it directly
What types of Nova proxies are you familiar with?
- Nova-novncproxy - Access through VNC connections
- Nova-spicehtml5proxy - Access through SPICE
- Nova-xvpvncproxy - Access through a VNC connection
OpenStack Advanced - Networking (Neutron)
Explain BGP dynamic routing
What is the role of network namespaces in OpenStack?
OpenStack Advanced - Horizon
Can you describe Horizon in detail?
- Django-based project focusing on providing an OpenStack dashboard and the ability to create additional customized dashboards
- You can use it to access the different OpenStack services resources - instances, images, networks, ...
- By accessing the dashboard, users can use it to list, create, remove and modify the different resources
- It's also highly customizable and you can modify or add to it based on your needs
What can you tell about Horizon architecture?
- API is backward compatible
- There are three types of dashboards: user, system and settings
- It provides core support for all OpenStack core projects such as Neutron, Nova, etc. (out of the box, no need to install extra packages or plugins)
- Anyone can extend the dashboards and add new components
- Horizon provides templates and core classes from which one can build their own dashboard
Puppet
What is Puppet? How does it work?
- Puppet is a configuration management tool ensuring that all systems are configured to a desired and predictable state.
Explain Puppet architecture
- Puppet has a primary-secondary node architecture. The clients (agents) are distributed across the network and communicate with the primary server, where the Puppet modules are present. The client agent sends a certificate with its ID to the server; the server then signs that certificate and sends it back to the client. This authentication allows for secure and verifiable communication between the client and the primary server.
Can you compare Puppet to other configuration management tools? Why did you choose to use Puppet?
- Puppet is often compared to other configuration management tools like Chef, Ansible, SaltStack, and cfengine. The choice to use Puppet often depends on an organization's needs, such as ease of use, scalability, and community support.
Explain the following:
- Modules - are a collection of manifests, templates, and files
- Manifests - are the actual code for configuring the clients
- Node - allows you to assign specific configurations to specific nodes
Explain Facter
- Facter is a standalone tool that is integrated into Puppet and collects information about a system and its configuration, such as the operating system, IP addresses, memory, and network interfaces. These facts can be used in Puppet manifests to make decisions about how resources should be managed, and to customize the behavior of Puppet based on the characteristics of the system.
What is MCollective?
- MCollective is a middleware system that integrates with Puppet to provide orchestration, remote execution, and parallel job execution capabilities.
Do you have experience with writing modules? Which module have you created and for what?
Explain what is Hiera
- Hiera is a hierarchical data store in Puppet that is used to separate data from code, allowing data to be more easily separated, managed, and reused.
Elastic
What is the Elastic Stack?
The Elastic Stack consists of:
- Elasticsearch
- Kibana
- Logstash
- Beats
- Elastic Hadoop
- APM Server
Elasticsearch, Logstash and Kibana are also known as the ELK stack.
Explain what is Elasticsearch
From the official docs:
"Elasticsearch is a distributed document store. Instead of storing information as rows of columnar data, Elasticsearch stores complex data structures that have been serialized as JSON documents"
What is Logstash?
From the blog:
"Logstash is a powerful, flexible pipeline that collects, enriches and transports data. It works as an extract, transform & load (ETL) tool for collecting log messages."
Explain what beats are
Beats are lightweight data shippers. These data shippers are installed on the client where the data resides. Examples of beats: Filebeat, Metricbeat, Auditbeat. There are many more.
What is Kibana?
From the official docs:
"Kibana is an open source analytics and visualization platform designed to work with Elasticsearch. You use Kibana to search, view, and interact with data stored in Elasticsearch indices. You can easily perform advanced data analysis and visualize your data in a variety of charts, tables, and maps."
Describe what happens from the moment an app logged some information until it's displayed to the user in a dashboard when the Elastic stack is used
The process may vary based on the chosen architecture and the processing you may want to apply to the logs. One possible workflow is:
- The data logged by the application is picked up by Filebeat and sent to Logstash
- Logstash processes the log based on the defined filters. Once done, the output is sent to Elasticsearch
- Elasticsearch stores the document it got and the document is indexed for quick future access
- The user creates visualizations in Kibana which are based on the indexed data
- The user creates a dashboard which is composed of the visualizations created in the previous step
Elasticsearch
What is a data node?
This is where data is stored and also where different processing takes place (e.g. when you search for data).
What is a master node?
Part of a master node's responsibilities:
- Track the status of all the nodes in the cluster
- Verify replicas are working and the data is available from every data node.
- No hot nodes (no data node that works much harder than other nodes)
While there can be multiple master-eligible nodes, in reality only one of them is the elected master node.
What is an ingest node?
A node which is responsible for processing the data according to an ingest pipeline. In case you don't need to use Logstash, this node can receive data from Beats and process it, similarly to how it can be processed in Logstash.
What is a coordinating only node?
From the official docs:
Coordinating only nodes can benefit large clusters by offloading the coordinating node role from data and master-eligible nodes. They join the cluster and receive the full cluster state, like every other node, and they use the cluster state to route requests directly to the appropriate place(s).
How is data stored in Elasticsearch?
- Data is stored in an index
- The index is spread across the cluster using shards
What is an Index?
Index in Elasticsearch is in most cases compared to a whole database from the SQL/NoSQL world.
You can choose to have one index to hold all the data of your app or have multiple indices where each index holds a different type of data of your app (e.g. an index for each service your app is running).
The official docs also offer a great explanation (in general, it's really good documentation, as every project should have):
"An index can be thought of as an optimized collection of documents and each document is a collection of fields, which are the key-value pairs that contain your data"
Explain Shards
An index is split into shards and documents are hashed to a particular shard. Each shard may be on a different node in a cluster and each one of the shards is a self contained index.
This allows Elasticsearch to scale to an entire cluster of servers.
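The idea of hashing a document to a shard can be sketched roughly as follows. This is a minimal illustration, not Elasticsearch's actual implementation: `pick_shard` is a hypothetical helper, and `zlib.crc32` is only a stand-in for whatever hash function Elasticsearch uses internally.

```python
import zlib


def pick_shard(routing_key, num_primary_shards):
    """Map a routing key (assumed here to be the document ID) to a shard number."""
    # crc32 is a stand-in hash chosen for illustration only.
    return zlib.crc32(routing_key.encode("utf-8")) % num_primary_shards


doc_ids = ["book-1", "book-2", "book-3", "book-4"]
placement = {doc_id: pick_shard(doc_id, 3) for doc_id in doc_ids}
print(placement)
```

Because the shard is derived from the key and the shard count, the same document always lands on the same shard as long as the number of primary shards does not change.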
What is an Inverted Index?
来自官方文档:
"An inverted index lists every unique word that appears in any document and identifies all of the documents each word occurs in."
What is a Document?
Continuing with the comparison to SQL/NoSQL, a Document in Elasticsearch is a row in a table in the case of SQL, or a document in a collection in the case of NoSQL. As in NoSQL, a document is a JSON object which holds data on a unit in your app. What this unit is depends on your app. If your app is related to books, then each document describes a book. If your app is about shirts, then each document is a shirt.
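To tie documents and the inverted index together, here is a small sketch that builds an inverted index over a few toy documents. It assumes plain whitespace tokenization and lowercasing, which is far simpler than the analysis Elasticsearch performs; `build_inverted_index` and the sample `books` are made up for illustration.

```python
from collections import defaultdict


def build_inverted_index(documents):
    """Return a mapping of word -> set of IDs of the documents containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return dict(index)


books = {1: "The quick brown fox", 2: "The lazy dog", 3: "Quick thinking"}
index = build_inverted_index(books)
print(sorted(index["quick"]))  # [1, 3]
```

Looking up a word is then a single dictionary access instead of a scan over every document.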
You check the health of your Elasticsearch cluster and it's red. What does it mean? What can cause the status to be yellow instead of green?
Red means some data is unavailable in your cluster: at least one primary shard of your indices is unassigned. Yellow means that all primary shards are assigned but some replica shards are not. You can be in this state if you have a single node and your indices have replicas. Green means that all shards in the cluster are assigned to nodes and your cluster is healthy.
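The three states can be expressed as a small decision function. This is only a sketch: the shard records are a made-up structure with `primary` and `assigned` flags, not the output format of any Elasticsearch API.

```python
def cluster_health(shards):
    """Derive red/yellow/green from a list of shard records (hypothetical format)."""
    primaries_ok = all(s["assigned"] for s in shards if s["primary"])
    replicas_ok = all(s["assigned"] for s in shards if not s["primary"])
    if not primaries_ok:
        return "red"      # some data is unavailable
    if not replicas_ok:
        return "yellow"   # data is available but not fully replicated
    return "green"


# A single-node cluster whose index has one replica: the replica has nowhere to go.
single_node = [
    {"primary": True, "assigned": True},
    {"primary": False, "assigned": False},
]
print(cluster_health(single_node))  # yellow
```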
True or False? Elasticsearch indexes all data in every field and each indexed field has the same data structure for unified and quick query ability
False. From the official docs:
"Each indexed field has a dedicated, optimized data structure. For example, text fields are stored in inverted indices, and numeric and geo fields are stored in BKD trees."
What reserved fields a document has?
Explain Mapping
What are the advantages of defining your own mapping? (or: when would you use your own mapping?)
- You can optimize fields for partial matching
- You can define custom formats of known fields (e.g. date)
- You can perform language-specific analysis
Explain Replicas
In a network/cloud environment where failures can be expected any time, it is very useful and highly recommended to have a failover mechanism in case a shard/node somehow goes offline or disappears for whatever reason. To this end, Elasticsearch allows you to make one or more copies of your index's shards into what are called replica shards, or replicas for short.
Can you explain Term Frequency & Document Frequency?
Term Frequency is how often a term appears in a given document and Document Frequency is how often a term appears across all documents. Both are used to determine the relevance of a term by calculating Term Frequency / Document Frequency.
You check "Current Phase" under "Index lifecycle management" and you see it's set to "hot". What does it mean?
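The ratio described above can be worked through on a toy corpus. This sketch assumes whitespace tokenization and counts Document Frequency as the number of documents containing the term; real scoring in Elasticsearch is more involved. All function names and sample documents are illustrative.

```python
def term_frequency(term, document):
    """How many times the term appears in one document."""
    return document.lower().split().count(term.lower())


def document_frequency(term, documents):
    """In how many documents the term appears (an assumed reading of DF)."""
    return sum(1 for d in documents if term.lower() in d.lower().split())


def relevance(term, document, documents):
    """Term Frequency / Document Frequency, or 0.0 if the term never appears."""
    df = document_frequency(term, documents)
    return term_frequency(term, document) / df if df else 0.0


docs = ["the cat sat on the mat", "the dog barked", "a cat and a cat"]
print(relevance("cat", docs[2], docs))  # 1.0
print(relevance("cat", docs[0], docs))  # 0.5
```

A term that is frequent in one document but rare across the corpus scores higher than a term that appears everywhere.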
"The index is actively being written to". More about the phases here
What does this command do? curl -X PUT "localhost:9200/customer/_doc/1?pretty" -H 'Content-Type: application/json' -d'{ "name": "John Doe" }'
It creates the customer index if it doesn't exist and adds a new document with the field name set to "John Doe". The document gets the ID 1, as specified in the URL.
What will happen if you run the previous command twice? What about running it 100 times?
- If the name value was different, it would update "name" to the new value
- In any case, it bumps the version field by one
What is the Bulk API? What would you use it for?
The Bulk API is used when you need to index multiple documents. For a high number of documents it is significantly faster than individual requests since there are fewer network roundtrips.
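A bulk request body is newline-delimited JSON: an action line followed by the document source, repeated for each document, with a trailing newline. The sketch below only builds that body; `build_bulk_body` is a hypothetical helper and the documents are sample data. Sending it (for example to the `_bulk` endpoint with an `application/x-ndjson` content type) is left out.

```python
import json


def build_bulk_body(index_name, docs):
    """Build an NDJSON body that indexes every document in one request."""
    lines = []
    for doc_id, source in docs.items():
        # Action line: which operation, which index, which document ID.
        lines.append(json.dumps({"index": {"_index": index_name, "_id": doc_id}}))
        # Source line: the document itself.
        lines.append(json.dumps(source))
    return "\n".join(lines) + "\n"


body = build_bulk_body("customer", {"1": {"name": "John Doe"}, "2": {"name": "Jane Doe"}})
print(body)
```

Two documents travel in one request here instead of two separate PUT calls.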
Query DSL
Explain Elasticsearch query syntax (Booleans, Fields, Ranges)
Explain what is Relevance Score
Explain Query Context and Filter Context
From the official docs:
"In the query context, a query clause answers the question “How well does this document match this query clause?” Besides deciding whether or not the document matches, the query clause also calculates a relevance score in the _score meta-field."
"In a filter context, a query clause answers the question “Does this document match this query clause?” The answer is a simple Yes or No — no scores are calculated. Filter context is mostly used for filtering structured data"
Describe how the architecture of a production environment with large amounts of data would be different from a small-scale environment
There are several possible answers for this question. One of them is as follows:
A small-scale architecture of Elastic will consist of the Elastic Stack as it is. This means we will have Beats, Logstash, Elasticsearch and Kibana.
A production environment with large amounts of data can include some kind of buffering component (e.g. Redis or RabbitMQ) and also a security component such as Nginx.
Logstash
What are Logstash plugins? What plugins types are there?
- Input Plugins - how to collect data from different sources
- Filter Plugins - processing data
- Output Plugins - push data to different outputs/services/platforms
What is grok?
A Logstash filter plugin which parses unstructured text (such as log lines) into structured fields by matching it against named patterns.
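Grok itself is configured inside Logstash, but the underlying idea of turning a text line into named fields can be approximated with a regular expression using named groups. The pattern, field names and log line below are invented for illustration and are not grok syntax.

```python
import re

# Rough analog of a grok pattern: each named group becomes a structured field.
LOG_PATTERN = re.compile(
    r"(?P<client_ip>\d{1,3}(?:\.\d{1,3}){3}) "
    r"(?P<method>[A-Z]+) "
    r"(?P<path>\S+) "
    r"(?P<status>\d{3})"
)


def parse_line(line):
    """Return the extracted fields, or None when the line does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


print(parse_line("10.0.0.7 GET /index.html 200"))
print(parse_line("not a log line"))
```

The `None` result for a non-matching line is loosely comparable to Logstash tagging an event with `_grokparsefailure`.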
How grok works?
What grok patterns are you familiar with?
What is `_grokparsefailure`?
How do you test or debug grok patterns?
What are Logstash Codecs? What codecs are there?
Kibana
What can you find under "Discover" in Kibana?
The raw data as it is stored in the index. You can search and filter it.
You see in Kibana, after clicking on Discover, "561 hits". What does it mean?
Total number of documents matching the search. If no query is used, then simply the total number of documents.
What can you find under "Visualize"?
"Visualize" is where you can create visual representations for your data (pie charts, graphs, ...)
What visualization types are supported/included in Kibana?
What visualization type would you use for statistical outliers
Describe in detail how do you create a dashboard in Kibana
Filebeat
What is Filebeat?
Filebeat is used to monitor the logging directories inside of VMs or mounted as a sidecar if exporting logs from containers, and then forward these logs onward for further processing, usually to logstash.
If one is using ELK, is it a must to also use filebeat? In what scenarios it's useful to use filebeat?
Filebeat is a typical component of the ELK stack, since it was developed by Elastic to work with the other products (Logstash and Kibana). It's possible to send logs directly to logstash, though this often requires coding changes for the application. Particularly for legacy applications with little test coverage, it might be a better option to use filebeat, since you don't need to make any changes to the application code.
What is a harvester?
Read here
True or False? A single harvester harvests multiple files, according to the limits set in filebeat.yml
False. One harvester harvests one file.
What are filebeat modules?
These are pre-configured modules for specific types of logging locations (e.g. Traefik, Fargate, HAProxy) to make it easy to configure forwarding logs using Filebeat. They have different configurations based on where you're collecting logs from.
Elastic Stack
How do you secure an Elastic Stack?
You can generate certificates with the provided Elastic utilities and change the configuration to enable certificate-based security.
Distributed
Explain Distributed Computing (or Distributed System)
According to Martin Kleppmann:
"Many processes running on many machines...only message-passing via an unreliable network with variable delays, and the system may suffer from partial failures, unreliable clocks, and process pauses."
Another definition: "Systems that are physically separated, but logically connected"
What can cause a system to fail?
Do you know what is "CAP theorem"? (aka Brewer's theorem)
According to the CAP theorem, it's not possible for a distributed data store to provide more than two of the following at the same time:
- Availability: Every request receives a response (it doesn't have to be the most recent data)
- Consistency: Every request receives a response with the latest/most recent data
- Partition tolerance: Even if some of the messages between nodes are lost/dropped, the system keeps running
What are the problems with the following design? How can it be improved?
1. The transition can take time. In other words, noticeable downtime.
2. The standby server is a waste of resources - if the first application server is running then the standby does nothing
What are the problems with the following design? How can it be improved?
Issues: If the load balancer dies, we lose the ability to communicate with the application.
Ways to improve:
- Add another load balancer
- Use DNS A record for both load balancers
- Use message queue
What is "Shared-Nothing" architecture?
It's an architecture in which data is stored and retrieved from a single, non-shared source, usually exclusively connected to one node, as opposed to architectures where the request can get to one of many nodes and the data will be retrieved from one shared location (storage, memory, ...).
Explain the Sidecar Pattern (Or sidecar proxy)
Misc
Name | Topic | Objective & Instructions | Solution | Comments |
---|---|---|---|---|
Highly Available "Hello World" | Exercise | Solution | | |
What happens when you type in a URL in an address bar in a browser?
- The browser searches for the record of the domain name IP address in the DNS in the following order:
- Browser cache
- Operating system cache
- The DNS server configured on the user's system (can be ISP DNS, public DNS, ...)
- If it couldn't find a DNS record locally, a full DNS resolution is started.
- It connects to the server using the TCP protocol
- The browser sends an HTTP request to the server
- The server sends an HTTP response back to the browser
- The browser renders the response (e.g. HTML)
- The browser then sends subsequent requests as needed to the server to get the embedded links, javascript, images in the HTML and then steps 3 to 5 are repeated.
TODO: add more details!
API
Explain what is an API
I like this definition from blog.christianposta.com:
"An explicitly and purposefully defined interface designed to be invoked over a network that enables software developers to get programmatic access to data and functionality within an organization in a controlled and comfortable way."
What is an API specification?
From swagger.io:
"An API specification provides a broad understanding of how an API behaves and how the API links with other APIs. It explains how the API functions and the results to expect when using the API"
True or False? API Definition is the same as API Specification
False. From swagger.io:
"An API definition is similar to an API specification in that it provides an understanding of how an API is organized and how the API functions. But the API definition is aimed at machine consumption instead of human consumption of APIs."
What is an API gateway?
An API gateway is like the gatekeeper that controls how different parts talk to each other and how information is exchanged between them.
The API gateway provides a single point of entry for all clients, and it can perform several tasks, including routing requests to the appropriate backend service, load balancing, security and authentication, rate limiting, caching, and monitoring.
By using an API gateway, organizations can simplify the management of their APIs, ensure consistent security and governance, and improve the performance and scalability of their backend services. They are also commonly used in microservices architectures, where there are many small, independent services that need to be accessed by different clients.
What are the advantages of using/implementing an API gateway?
Advantages:
- Simplifies API management: Provides a single entry point for all requests, which simplifies the management and monitoring of multiple APIs.
- Improves security: Able to implement security features like authentication, authorization, and encryption to protect the backend services from unauthorized access.
- Enhances scalability: Can handle traffic spikes and distribute requests to backend services in a way that maximizes resource utilization and improves overall system performance.
- Enables service composition: Can combine different backend services into a single API, providing more granular control over the services that clients can access.
- Facilitates integration with external systems: Can be used to expose internal services to external partners or customers, making it easier to integrate with external systems and enabling new business models.
What is a Payload in API?
What is automation? How is it related to or different from orchestration?
Automation is the act of automating tasks to reduce human intervention or interaction in regards to IT technology and systems.
While automation focuses on the task level, orchestration is the process of automating processes and/or workflows which consist of multiple tasks that usually span multiple systems.
Tell me about interesting bugs you've found and also fixed
What is a Debugger and how it works?
What services an application might have?
What is Metadata?
Data about data. Basically, it describes the type of information that an underlying data will hold.
You can use one of the following formats: JSON, YAML, XML. Which one would you use? Why?
I can't answer this for you :)
What's KPI?
What's OKR?
What's DSL (Domain Specific Language)?
Domain Specific Languages (DSLs) are customised languages that represent a domain in such a way that domain experts can easily interpret them.
What's the difference between KPI and OKR?
YAML
What is YAML?
Data serialization language used by many technologies today like Kubernetes, Ansible, etc.
True or False? Any valid JSON file is also a valid YAML file
True. Because YAML is a superset of JSON.
What is the format of the following data?
{
  "applications": [
    {
      "name": "my_app",
      "language": "python",
      "version": 20.17
    }
  ]
}
JSON
What is the format of the following data?
applications:
  - app: "my_app"
    language: "python"
    version: 20.17
YAML
How to write a multi-line string with YAML? What use cases is it good for?
someMultiLineString: |
  look mama
  I can write a multi-line string
  I love YAML
It's good for use cases like writing a shell script where each line of the script is a different command.
What is the difference between `someMultiLineString: |` and `someMultiLineString: >`?
Using `>` folds the multi-line string into a single line, while `|` keeps the line breaks:
someMultiLineString: >
  This is actually
  a single line
  do not let appearances fool you
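The difference between the two block styles can be mimicked in a few lines. This is a simplified simulation rather than a YAML parser: it assumes no blank lines, no extra indentation inside the block, and the default handling of the final newline. Both helper names are hypothetical.

```python
def literal_block(lines):
    """Approximate `|`: line breaks are kept as written."""
    return "\n".join(lines) + "\n"


def folded_block(lines):
    """Approximate `>`: line breaks inside the block become spaces."""
    return " ".join(lines) + "\n"


lines = ["This is actually", "a single line", "do not let appearances fool you"]
print(repr(literal_block(lines)))
print(repr(folded_block(lines)))
```

The literal form suits content where each line matters, such as a shell script, while the folded form suits long prose that is wrapped only for readability.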
What are placeholders in YAML?
They allow you to reference values instead of writing them directly, and they are used like this:
username: {{ my.user_name }}
How can you define multiple YAML components in one file?
Using this: ---
For example:
document_number: 1
---
document_number: 2
Firmware
Explain what is a firmware
Wikipedia: "In computing, firmware is a specific class of computer software that provides the low-level control for a device's specific hardware. Firmware, such as the BIOS of a personal computer, may contain basic functions of a device, and may provide hardware abstraction services to higher-level software such as operating systems."
Cassandra
When running a cassandra cluster, how often do you need to run nodetool repair in order to keep the cluster consistent?
- Within the columnFamily GC-grace Once a week
- Less than the compacted partition minimum bytes
- Depends on the compaction strategy
HTTP
What is HTTP?
Avinetworks: HTTP stands for Hypertext Transfer Protocol. HTTP uses TCP port 80 to enable internet communication. It is part of the Application Layer (L7) in OSI Model.
Describe HTTP request lifecycle
- Resolve host by request to DNS resolver
- Client SYN
- Server SYN+ACK
- Client ACK
- HTTP Request
- HTTP Response
True or False? HTTP is stateful
False. It doesn't maintain state between incoming requests.
What does an HTTP request look like?
It consists of:
- Request line - request type
- Headers - content info like length, encoding, etc.
- Body (not always included)
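The three parts listed above can be seen by assembling a raw HTTP/1.1 request by hand. This is a minimal sketch: `build_request` is a hypothetical helper, `example.com` is a placeholder host, and only the `Host` and `Content-Length` headers are included.

```python
def build_request(method, path, host, body=""):
    """Assemble request line, headers, a blank line, and an optional body."""
    headers = {"Host": host}
    if body:
        headers["Content-Length"] = str(len(body.encode("utf-8")))
    header_lines = "".join(f"{name}: {value}\r\n" for name, value in headers.items())
    # The empty line (\r\n) separates the headers from the body.
    return f"{method} {path} HTTP/1.1\r\n{header_lines}\r\n{body}"


print(repr(build_request("GET", "/", "example.com")))
print(repr(build_request("POST", "/form", "example.com", body="a=1")))
```

The GET request ends right after the blank line, which shows why the body is described as not always included.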
What HTTP method types are there?
What HTTP response codes are there?
- 1xx - Informational
- 2xx - Success
- 3xx - Redirect
- 4xx - Error, client fault
- 5xx - Error, server fault
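Since the class of a response is determined by its first digit, a lookup can be sketched in a couple of lines. The function name and the label strings are illustrative only.

```python
STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirect",
    4: "client error",
    5: "server error",
}


def classify(status_code):
    """Return the response class for a status code based on its first digit."""
    return STATUS_CLASSES.get(status_code // 100, "unknown")


print(classify(404))  # client error
print(classify(504))  # server error
```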
What is HTTPS?
HTTPS is a secure version of the HTTP protocol used to transfer data between a web browser and a web server. It encrypts the communication using SSL/TLS encryption to ensure that the data is private and secure.
Learn more: https://www.cloudflare.com/learning/ssl/why-is-http-not-secure/
Explain HTTP Cookies
HTTP is stateless. To share state, we can use Cookies.
TODO: explain what is actually a Cookie
What is HTTP Pipelining?
You get "504 Gateway Timeout" error from an HTTP server. What does it mean?
The server didn't receive a response from another server it communicates with in a timely manner.
What is a proxy?
A proxy is a server that acts as a middleman between a client device and a destination server. It can help improve privacy, security, and performance by hiding the client's IP address, filtering content, and caching frequently accessed data.
- Proxies can be used for load balancing, distributing traffic across multiple servers to help prevent server overload and improve website or application performance. They can also be used for data analysis, as they can log requests and traffic, providing useful insights into user behavior and preferences.
What is a reverse proxy?
A reverse proxy is a type of proxy server that sits between a client and a server, but it is used to manage traffic going in the opposite direction of a traditional forward proxy. In a forward proxy, the client sends requests to the proxy server, which then forwards them to the destination server. However, in a reverse proxy, the client sends requests to the destination server, but the requests are intercepted by the reverse proxy before they reach the server.
- They're commonly used to improve web server performance, provide high availability and fault tolerance, and enhance security by preventing direct access to the back-end server. They are often used in large-scale web applications and high-traffic websites to manage and distribute requests to multiple servers, resulting in improved scalability and reliability.
When you publish a project, you usually publish it with a license. What types of licenses are you familiar with and which one do you prefer to use?
Explain what is "X-Forwarded-For"
Wikipedia: "The X-Forwarded-For (XFF) HTTP header field is a common method for identifying the originating IP address of a client connecting to a web server through an HTTP proxy or load balancer."
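When a request passes through several proxies, each one typically appends the address it received the request from, so the header becomes a comma-separated list with the originating client first. A small sketch of reading it follows; the function name and the addresses are examples. The header is supplied by the client side and can be spoofed, so it should only be trusted when set by proxies you control.

```python
def client_ip_from_xff(header_value):
    """Return the left-most address in an X-Forwarded-For value, or None if empty."""
    parts = [part.strip() for part in header_value.split(",") if part.strip()]
    return parts[0] if parts else None


print(client_ip_from_xff("203.0.113.7, 198.51.100.2, 192.0.2.10"))  # 203.0.113.7
```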
Load Balancers
What is a load balancer?
A load balancer accepts (or denies) incoming network traffic from a client, and based on some criteria (application related, network, etc.) it distributes those communications out to servers (at least one).
Why use a load balancer?
- Scalability - using a load balancer, you can add more servers in the backend to handle more requests/traffic from the clients, as opposed to using one server.
- Redundancy - if one server in the backend dies, the load balancer will keep forwarding the traffic/requests to the second server so users won't even notice one of the servers in the backend is down.
What load balancer techniques/algorithms are you familiar with?
- Round Robin
- Weighted Round Robin
- Least Connection
- Weighted Least Connection
- Resource Based
- Fixed Weighting
- Weighted Response Time
- Source IP Hash
- URL Hash
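A few of the algorithms in this list can be sketched in plain Python to show how they differ. These are simplified illustrations with made-up class names and server names, not how any particular load balancer implements them; health checks and concurrency are ignored.

```python
import hashlib
import itertools


class RoundRobin:
    """Hand out servers in a fixed rotation."""
    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self):
        return next(self._cycle)


class WeightedRoundRobin(RoundRobin):
    """Repeat each server in the rotation according to its weight."""
    def __init__(self, weights):
        super().__init__([s for s, w in weights.items() for _ in range(w)])


class LeastConnections:
    """Pick the server with the fewest active connections."""
    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def pick(self):
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        self.active[server] -= 1


def source_ip_hash(client_ip, servers):
    """Map a client IP to a server so the same client keeps hitting the same one."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]


rr = RoundRobin(["a", "b", "c"])
print([rr.pick() for _ in range(4)])  # ['a', 'b', 'c', 'a']
```

Round robin ignores load entirely, least connections reacts to it, and source IP hash trades even distribution for client affinity.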
What are the drawbacks of round robin algorithm in load balancing?
- A simple round robin algorithm knows nothing about the load and the spec of each server it forwards the requests to. It is possible that multiple heavy workload requests will get to the same server while other servers get only lightweight requests, which will result in one server doing most of the work, maybe even crashing at some point because it is unable to handle all the heavy workload requests on its own.
- Each request from the client creates a whole new session. This might be a problem for certain scenarios where you would like to perform multiple operations and the server has to know about the result of a previous operation, that is, be aware of the history it has with the client. In round robin, the first request might hit server X, while the second request might hit server Y and ask to continue processing the data that was already processed on server X.
What is an Application Load Balancer?
In which scenarios would you use ALB?
At what layers can a load balancer operate?
L4 and L7
Can you perform load balancing without using a dedicated load balancer instance?
Yes, you can use DNS for performing load balancing.
What is DNS load balancing? What are its advantages? When would you use it?
Load Balancers - Sticky Sessions
What are sticky sessions? What are their pros and cons?
Recommended read:
Cons:
- Can cause uneven load on instances (since requests are routed to the same instances)
Pros:
- Ensures in-proc sessions are not lost when a new request is created
Name one use case for using sticky sessions
You would like to make sure the user doesn't lose the current session data.
What do sticky sessions use for enabling the "stickiness"?
Cookies. There are application based cookies and duration based cookies.
Explain application-based cookies
- Generated by the application and/or the load balancer
- Usually allows including custom data
Explain duration-based cookies
- Generated by the load balancer
- Session is not sticky anymore once the duration has elapsed
Load Balancers - Load Balancing Algorithms
Explain each of the following load balancing techniques
- Round Robin
- Weighted Round Robin
- Least Connection
- Weighted Least Connection
- Resource Based
- Fixed Weighting
- Weighted Response Time
- Source IP Hash
- URL Hash
Explain use case for connection draining?
To ensure that a Classic Load Balancer stops sending requests to instances that are de-registering or unhealthy, while keeping the existing connections open, use connection draining. This enables the load balancer to complete in-flight requests made to instances that are de-registering or unhealthy. The maximum timeout value can be set between 1 and 3,600 seconds on both GCP and AWS.
Licenses
Are you familiar with "Creative Commons"? What do you know about it?
The Creative Commons licenses are a set of copyright licenses that allow creators to share their work with the public while retaining some control over how it can be used. They were developed as a response to the restrictive standards of traditional copyright laws, which limited access to creative works. They allow creators to choose the terms under which their works can be shared, distributed, and used by others. There are six main types of Creative Commons licenses, each with different levels of restrictions and permissions:
- Attribution (CC BY): Allows others to distribute, remix, and build upon the work, even commercially, as long as they credit the original creator.
- Attribution-ShareAlike (CC BY-SA): Allows others to remix and build upon the work, even commercially, as long as they credit the original creator and release any new creations under the same license.
- Attribution-NoDerivs (CC BY-ND): Allows others to distribute the work, even commercially, but they cannot remix or change it in any way and must credit the original creator.
- Attribution-NonCommercial (CC BY-NC): Allows others to remix and build upon the work, but they cannot use it commercially and must credit the original creator.
- Attribution-NonCommercial-ShareAlike (CC BY-NC-SA): Allows others to remix and build upon the work, but they cannot use it commercially, must credit the original creator, and must release any new creations under the same license.
- Attribution-NonCommercial-NoDerivs (CC BY-NC-ND): Allows others to download and share the work, but they cannot use it commercially, remix or change it in any way, and must credit the original creator.
Simply stated, the Creative Commons licenses are a way for creators to share their work with the public while retaining some control over how it can be used. The licenses promote creativity, innovation, and collaboration while respecting the rights of creators and encouraging the responsible use of creative works.
More information: https://creativecommons.org/licenses/
Explain the differences between copyleft and permissive licenses
With copyleft, any derivative work must be released under the same license, while permissive licenses impose no such condition. GPL-3 is an example of a copyleft license, while BSD is an example of a permissive license.
Misc
How does a search engine work?
How does auto completion work?
What is faster than RAM?
CPU cache. Source
What is a memory leak?
A memory leak is a programming error that occurs when a program fails to release memory that is no longer needed, causing the program to consume increasing amounts of memory over time.
Leaks can lead to a variety of problems, including system crashes, performance degradation, and instability. They often surface after failed maintenance on older systems or compatibility problems with new components over time.
What is your favorite protocol?
SSH HTTP DHCP DNS ...
What is Cache API?
What is the C10K problem? Is it relevant today?
https://idiallo.com/blog/c10k-2016
Storage
What types of storage are there?
Explain Object Storage
- Data is divided into self-contained objects
- Objects can contain metadata
What are the pros and cons of object storage?
Pros:
- Usually with object storage, you pay for what you use, as opposed to other storage types where you pay for the storage space you allocate
- Scalable storage: object storage is mostly based on a model where what you use is what you get, and you can add storage as needed
Cons:
- Usually performs slower than other types of storage
- No granular modification: to change an object, you have to re-create it
What are some use cases for using object storage?
Explain File Storage
- File storage is used for storing data in files, in a hierarchical structure
- Some of the devices for file storage: hard drive, flash drive, cloud-based file storage
- Files are usually organized in directories
What are the pros and cons of File Storage?
Pros:
- Users have full control of their own files and can run a variety of operations on them: delete, read, write and move.
- Security mechanisms give users better control over things such as file locking
What are some examples of file storage?
- Local filesystem
- Dropbox
- Google Drive
What types of storage devices are there?
Explain IOPS
Explain storage throughput
What is a filesystem?
A file system is a way for computers and other electronic devices to organize and store data files. It provides a structure that helps to organize data into files and directories, making it easier to find and manage information. A file system is crucial for providing a way to store and manage data in an organized manner.
Commonly used file systems:
Windows:
Mac OS:
Explain Dark Data
Explain MBR
Questions you CAN ask
A list of questions you, as a candidate, can ask the interviewer during or after the interview. These are only suggestions; use them carefully. Not every interviewer will be able (or happy) to answer them, which should perhaps be a red flag about working in such a place, but that's really up to you.
What do you like about working here?
How does the company promote personal growth?
What is the current level of technical debt you are dealing with?
Be careful when asking this question: all companies, regardless of size, have some level of tech debt. Phrase the question in a way that acknowledges all companies have to deal with this, but that you want to understand the current pain points they are dealing with.
This is a great way to figure how managers deal with unplanned work, and how good they are at setting expectations with projects.
Why should I NOT join you? (or 'What don't you like about working here?')
What was your favorite project you've worked on?
This can give you insights in some of the cool projects a company is working on, and if you would enjoy working on projects like these. This is also a good way to see if the managers are allowing employees to learn and grow with projects outside of the normal work you'd do.
If you could change one thing about your day to day, what would it be?
Similar to the tech debt question, this helps you identify any pain points with the company. Additionally, it can be a great way to show how you'd be an asset to the team.
For example, if they mention they have problem X, and you've solved that in the past, you can show how you'd be able to mitigate that problem.
Let's say that we agree and you hire me for this position. After X months, what do you expect me to have achieved?
Not only will this tell you what is expected of you, it will also give you a big hint about the type of work you are going to do in the first months of your job.
Testing
Explain white-box testing
Explain black-box testing
What are unit tests?
Unit tests are a software testing technique that involves systematically breaking down a system and testing each individual part of the assembly. These tests are automated and can be run repeatedly, allowing developers to catch edge cases and bugs quickly while developing.
The main objective of unit tests is to verify that each function produces the proper outputs for a given set of inputs.
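As a minimal sketch of the idea, here is a unit test written with Python's built-in `unittest` module. The `clamp` function and the `ClampTests` class are hypothetical and exist only for illustration; each test feeds the function a known input and checks the output.

```python
import unittest


def clamp(value, low, high):
    """Hypothetical function under test: restrict value to the range [low, high]."""
    return max(low, min(value, high))


class ClampTests(unittest.TestCase):
    def test_value_inside_range_is_unchanged(self):
        self.assertEqual(clamp(5, 0, 10), 5)

    def test_value_below_range_is_raised_to_low(self):
        self.assertEqual(clamp(-3, 0, 10), 0)

    def test_value_above_range_is_lowered_to_high(self):
        self.assertEqual(clamp(42, 0, 10), 10)


# Load and run the test case explicitly so the script does not exit the interpreter.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(ClampTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Each test covers one behavior (inside the range, below it, above it), so a failure points directly at the case that broke.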
What types of tests would you run to test a web application?
Explain test harness?
What is A/B testing?
What is network simulation and how do you perform it?
What types of performance tests are you familiar with?
Explain the following types of tests:
- Load Testing
- Stress Testing
- Capacity Testing
- Volume Testing
- Endurance Testing
Regex
Given a text file, perform the following exercises
Extract
Extract all the numbers
Extract the first word of each line
- "^\w+"
Bonus: extract the last word of each line
- "\w+(?=\W*$)" (in most cases, depends on line formatting)
Extract all the IP addresses
- "\b(?:\d{1,3}\.){3}\d{1,3}\b" IPv4 (this pattern matches a 1-to-3-digit group followed by a dot three times, then a final 1-to-3-digit group)
Extract dates in the format of yyyy-mm-dd or yyyy-dd-mm
Extract email addresses
- "\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"
Replace
Replace tabs with four spaces
Replace 'red' with 'green'
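One way to try the patterns from this section is Python's `re` module. The sketch below uses a made-up two-line sample string instead of a real file, and the two substitutions at the end are only one possible answer to the replace exercises (the `\bred\b` word boundary is an assumption that only the whole word should be replaced).

```python
import re

# Hypothetical sample text; in the exercise this would be read from a file.
text = (
    "alpha 10.0.0.1 mailed bob@example.com today\n"
    "beta 192.168.1.20 pinged carol@test.org twice"
)

# re.MULTILINE makes ^ and $ match at every line boundary, not only at the ends of the text.
first_words = re.findall(r"^\w+", text, flags=re.MULTILINE)
last_words = re.findall(r"\w+(?=\W*$)", text, flags=re.MULTILINE)

# IPv4-shaped tokens and email addresses anywhere in the text.
ipv4_addresses = re.findall(r"\b(?:\d{1,3}\.){3}\d{1,3}\b", text)
emails = re.findall(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b", text)

# Replace exercises: tabs become four spaces, the whole word 'red' becomes 'green'.
with_spaces = re.sub(r"\t", "    ", "name\tcolor")
recolored = re.sub(r"\bred\b", "green", "red door, redder door")
```

Note that the IPv4 pattern only checks the shape of the address; it does not verify that each group is in the 0-255 range.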
System Design
Explain what a "single point of failure" is.
A "single point of failure" is a component of a system or organization that, if it were to fail, would cause the entire system to fail or significantly disrupt its operation. In other words, it is a vulnerability where there is no backup in place to compensate for the failure.
What is a CDN?
A CDN (Content Delivery Network) is responsible for distributing content geographically. Part of it is what is known as edge locations, aka cache proxies, which allow users to get their content quickly thanks to caching and geographical distribution.
Explain Multi-CDN
With a single CDN, all content originates from one content delivery network.
In multi-CDN, content is distributed across multiple different CDNs, each might be on a completely different provider/cloud.
What are the benefits of Multi-CDN over a single CDN?
- Resiliency: Relying on one CDN means no redundancy. With multiple CDNs, you don't need to worry about a single CDN being down
- Flexibility in costs: Using one CDN ties you to that CDN's specific rates. With multiple CDNs, you can consider using less expensive CDNs to deliver the content.
- Performance: With Multi-CDN, there is greater potential for choosing locations that are closer to the client requesting the content
- Scale: With multiple CDNs, you can scale services to support more extreme conditions
Explain "3-Tier Architecture" (including pros and cons)
A "3-Tier Architecture" is a pattern used in software development for designing and structuring applications. It divides the application into 3 interconnected layers: Presentation, Business logic and Data storage.
PROS:
- Scalability
- Security
- Reusability
CONS:
- Complexity
- Performance overhead
- Cost and development time
Explain Mono-repo vs. Multi-repo. What are the cons and pros of each approach?
In a Mono-repo, all the code for an organization is stored in a single, centralized repository.
PROS (Mono-repo):
- Unified tooling
- Code sharing
CONS (Mono-repo):
- Increased complexity
- Slower cloning
In a Multi-repo setup, each component is stored in its own separate repository. Each repository has its own version control history.
PROS (Multi-repo):
- Simpler to manage
- Different teams and developers can work on different parts of the project independently, making parallel development easier.
CONS (Multi-repo):
- Code duplication
- Integration challenges
What are the drawbacks of monolithic architecture?
- Not suitable for frequent code changes and frequent deployment of new features
- Not designed for today's infrastructure (like public clouds)
- Scaling a team to work on a monolithic architecture is more challenging
- If a single component in this architecture fails, then the entire application fails.
What are the advantages of microservices architecture over a monolithic architecture?
- Each service can fail individually without escalating into an application-wide outage.
- Each service can be developed and maintained by a separate team, and this team can choose its own tools and coding language
What is a service mesh?
It is a layer that facilitates communication management and control between microservices in a containerized application. It handles tasks such as load balancing, encryption, and monitoring.
Explain "Loose Coupling"
In "Loose Coupling", components of a system communicate with each other with little knowledge of each other's internal workings. This improves scalability and ease of modification in complex systems.
What is a message queue? When is it used?
It is a communication mechanism used in distributed systems to enable asynchronous communication between different components. It is generally used when the systems use a microservices approach.
Scalability
Explain Scalability
The ability to easily grow in size and capacity based on demand and usage.
Explain Elasticity
The ability to grow but also to reduce based on what is required
Explain Disaster Recovery
Disaster recovery is the process of restoring critical business systems and data after a disruptive event. The goal is to minimize the impact and resume normal business activities quickly. This involves creating a plan, testing it, backing up critical data, and storing it in safe locations. In case of a disaster, the plan is then executed, backups are restored, and systems are hopefully brought back online. The recovery process may take hours or days depending on the extent of the damage to the infrastructure. This makes business planning important, as a well-designed and tested disaster recovery plan can minimize the impact of a disaster and keep operations going.
Explain Fault Tolerance and High Availability
Fault Tolerance - The ability to self-heal and return to normal capacity. Also the ability to withstand a failure and remain functional.
High Availability - Being able to access a resource (in some use cases, using different platforms)
What is the difference between high availability and Disaster Recovery?
wintellect.com: "High availability, simply put, is eliminating single points of failure and disaster recovery is the process of getting a system back to an operational state when a system is rendered inoperative. In essence, disaster recovery picks up when high availability fails, so HA first."
Explain Vertical Scaling
Vertical Scaling is the process of adding resources to increase power of existing servers. For example, adding more CPUs, adding more RAM, etc.
What are the disadvantages of Vertical Scaling?
With vertical scaling alone, the component still remains a single point of failure. In addition, there is a hardware limit: once a machine cannot take any more resources, you can no longer scale vertically.
Which type of cloud services usually support vertical scaling?
Databases, cache. It's common mostly for non-distributed systems.
Explain Horizontal Scaling
Horizontal Scaling is the process of adding more resources that will be able to handle requests as one unit
What is the disadvantage of Horizontal Scaling? What is often required in order to perform Horizontal Scaling?
A load balancer. You can add more resources, but if you would like them to be part of the process, you have to serve them the requests/responses. Also, data inconsistency is a concern with horizontal scaling.
Explain in which use cases you would use vertical scaling and in which use cases you would use horizontal scaling
Explain Resiliency and what ways are there to make a system more resilient
Explain "Consistent Hashing"
How would you update each of the services in the following drawing without having app (foo.com) downtime?
What is the problem with the following architecture and how would you fix it?
The load on the producers or consumers may be high which will then cause them to hang or crash.
Instead of working in "push mode", the consumers can pull tasks only when they are ready to handle them. It can be fixed by using a streaming platform like Kafka, Kinesis, etc. This platform will handle the high load/traffic and pass tasks/messages to consumers only when they are ready to receive them.
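The pull model can be sketched in a single process with Python's standard `queue` and `threading` modules. This is only an illustration under simplifying assumptions: the bounded `queue.Queue` stands in for a platform such as Kafka or Kinesis, and the worker count, buffer size, and "processing" step (doubling a number) are made up.

```python
import queue
import threading

# Bounded buffer standing in for the streaming platform (an assumption for this sketch).
tasks = queue.Queue(maxsize=100)
results = []
results_lock = threading.Lock()


def consumer():
    while True:
        item = tasks.get()  # pull: the consumer asks for work only when it is ready
        if item is None:    # sentinel value telling this consumer to stop
            tasks.task_done()
            break
        with results_lock:
            results.append(item * 2)  # placeholder for real processing
        tasks.task_done()


workers = [threading.Thread(target=consumer) for _ in range(3)]
for worker in workers:
    worker.start()

for number in range(10):
    tasks.put(number)  # blocks if the buffer is full, which slows the producer down

for _ in workers:
    tasks.put(None)    # one sentinel per consumer

tasks.join()
for worker in workers:
    worker.join()
```

Because `put` blocks when the buffer is full, a burst from the producer cannot overwhelm the consumers; they drain the queue at their own pace.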
Users report that there is a huge spike in processing time when adding a little more input data to process. What might be the problem?
How would you scale the architecture from the previous question to hundreds of users?
Cache
What is "cache"? In which cases would you use it?
What is "distributed cache"?
What is a "cache replacement policy"?
Take a look here
Which cache replacement policies are you familiar with?
You can find a list here
Explain the following cache policies:
Read about it here
Why not write everything to cache instead of a database/datastore?
Caching and databases serve different purposes and are optimized for different use cases. Caching is used to speed up read operations by storing frequently accessed data in memory or on a fast storage medium. By keeping data close to the application, caching reduces the latency and overhead of accessing data from a slower, more distant storage system such as a database or disk.
On the other hand, databases are optimized for storing and managing persistent data. Databases are designed to handle concurrent read and write operations, enforce consistency and integrity constraints, and provide features such as indexing and querying.
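The division of labor described above is often implemented as a cache in front of the database (sometimes called "cache-aside"). The sketch below is a simplified assumption-laden illustration: a plain dict plays the cache, an in-memory SQLite database plays the persistent store, and the `users` table, `get_user`, and `rename_user` are hypothetical names.

```python
import sqlite3

# In-memory SQLite database standing in for the persistent datastore.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
db.execute("INSERT INTO users VALUES (1, 'alice')")
db.commit()

cache = {}                # a plain dict standing in for a cache such as Redis
stats = {"db_reads": 0}   # counts how often the database is actually queried


def get_user(user_id):
    """Read path: try the cache first, fall back to the database on a miss."""
    if user_id in cache:
        return cache[user_id]
    stats["db_reads"] += 1
    row = db.execute("SELECT name FROM users WHERE id = ?", (user_id,)).fetchone()
    name = row[0] if row else None
    if name is not None:
        cache[user_id] = name  # populate the cache for later reads
    return name


def rename_user(user_id, new_name):
    """Write path: the database is the source of truth, the cache entry is invalidated."""
    db.execute("UPDATE users SET name = ? WHERE id = ?", (new_name, user_id))
    db.commit()
    cache.pop(user_id, None)
```

Repeated reads are served from the cache, while writes always go to the database, so the data survives even if the cache is emptied.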
Migrations
How do you prepare for a migration? (or plan a migration)
You can mention:
- roll-back & roll-forward
- cut over
- dress rehearsals
- DNS redirection
Explain "Branch by Abstraction" technique
Design a system
Can you design a video streaming website?
Can you design a photo upload website?
How would you build a URL shortener?
More System Design Questions
Additional exercises can be found in system-design-notebook repository.
Hardware
What is a CPU?
A central processing unit (CPU) performs basic arithmetic, logic, controlling, and input/output (I/O) operations specified by the instructions in the program. This contrasts with external components such as main memory and I/O circuitry, and specialized processors such as graphics processing units (GPUs).
What is RAM?
RAM (Random Access Memory) is the hardware in a computing device where the operating system (OS), application programs and data in current use are kept so they can be quickly reached by the device's processor. RAM is the main memory in a computer. It is much faster to read from and write to than other kinds of storage, such as a hard disk drive (HDD), solid-state drive (SSD) or optical drive.
What is a GPU?
A GPU, or Graphics Processing Unit, is a specialized electronic circuit designed to expedite image and video processing for display on a computer screen.
What is an embedded system?
An embedded system is a computer system (a combination of a computer processor, computer memory, and input/output peripheral devices) that has a dedicated function within a larger mechanical or electronic system. It is embedded as part of a complete device, often including electrical or electronic hardware and mechanical parts.
Can you give an example of an embedded system?
A common example of an embedded system is a microwave oven's digital control panel, which is managed by a microcontroller.
When dedicated to a single task, a Raspberry Pi can also serve as an embedded system.
What types of storage are there?
There are several types of storage, including hard disk drives (HDDs), solid-state drives (SSDs), and optical drives (CD/DVD/Blu-ray). Other types of storage include USB flash drives, memory cards, and network-attached storage (NAS).
What are some considerations DevOps teams should keep in mind when selecting hardware for their job?
Choosing the right DevOps hardware is essential for ensuring streamlined CI/CD pipelines, timely feedback loops, and consistent service availability. Here's a distilled guide on what DevOps teams should consider:
Understanding Workloads:
- CPU: Consider the need for multi-core or high-frequency CPUs based on your tasks.
- RAM: Enough memory is vital for activities like large-scale coding or intensive automation.
- Storage: Evaluate storage speed and capacity. SSDs might be preferable for swift operations.
Scalability:
- Horizontal Growth: Check if you can boost capacity by adding more devices.
- Vertical Growth: Determine if upgrades (like RAM, CPU) to individual machines are feasible.
Connectivity Considerations:
- Data Transfer: Ensure high-speed network connections for activities like code retrieval and data transfers.
- Speed: Aim for low-latency networks, particularly important for distributed tasks.
- Backup Routes: Think about having backup network routes to avoid downtime.
Consistent Uptime:
- Plan for hardware redundancy like RAID configurations, backup power sources, or alternate network connections to ensure continuous service.
System Compatibility:
- Make sure your hardware aligns with your software, operating system, and intended platforms.
Power Efficiency:
- Hardware that uses energy efficiently can reduce costs in the long term, especially in large setups.
Security Measures:
- Explore hardware-level security features, such as TPM, to enhance protection.
Overseeing & Control:
- Tools like ILOM can be beneficial for remote handling.
- Make sure the hardware can be seamlessly monitored for health and performance.
Budgeting:
- Consider both initial expenses and long-term costs when budgeting.
Support & Community:
- Choose hardware from reputable vendors known for reliable support.
- Check for available drivers, updates, and community discussions around the hardware.
Planning Ahead:
- Opt for hardware that can cater to both present and upcoming requirements.
Operational Environment:
- Temperature Control: Ensure cooling systems to manage heat from high-performance units.
- Space Management: Assess hardware size considering available rack space.
- Reliable Power: Factor in consistent and backup power sources.
Cloud Coordination:
- If you're leaning towards a hybrid cloud setup, focus on how local hardware will mesh with cloud resources.
Life Span of Hardware:
- Be aware of the hardware's expected lifetime and when you might need replacements or upgrades.
Optimized for Virtualization:
- If utilizing virtual machines or containers, ensure the hardware is compatible and optimized for such workloads.
Adaptability:
- Modular hardware allows individual component replacements, offering more flexibility.
Avoiding Single Vendor Dependency:
- Try to prevent reliance on a single vendor unless there are clear advantages.
Eco-Friendly Choices:
- Prioritize sustainably produced hardware that's energy-efficient and environmentally responsible.
In essence, DevOps teams should choose hardware that is compatible with their tasks, versatile, gives good performance, and stays within their budget. Furthermore, long-term considerations such as maintenance, potential upgrades, and compatibility with impending technological shifts must be prioritized.
What is the role of hardware in disaster recovery planning and implementation?
Hardware is critical in disaster recovery (DR) solutions. While the broader scope of DR includes things like standard procedures, norms, and human roles, it's the hardware that keeps business processes running smoothly. Here's an outline of how hardware works with DR:
Storing Data and Ensuring Its Duplication:
- Backup Equipment: Devices like tape storage, backup servers, and external HDDs keep essential data stored safely at a different location.
- Disk Arrays: Systems such as RAID offer a safety net. If one disk crashes, the others compensate.
Alternate Systems for Recovery:
- Backup Servers: These step in when the main servers falter, maintaining service flow.
- Traffic Distributors: Devices like load balancers share traffic across servers. If a server crashes, they reroute users to operational ones.
Alternate Operation Hubs:
- Ready-to-use Centers: Locations equipped and primed to take charge immediately when the main center fails.
- Basic Facilities: Locations with necessary equipment but lacking recent data, taking longer to activate.
- Semi-prepped Facilities: Locations somewhat prepared with select systems and data, taking a moderate duration to activate.
Power Backup Mechanisms:
- Instant Power Backup: Devices like UPS offer power during brief outages, ensuring no abrupt shutdowns.
- Long-term Power Solutions: Generators keep vital systems operational during extended power losses.
Networking Equipment:
- Backup Internet Connections: Having alternatives ensures connectivity even if one provider faces issues.
- Secure Connection Tools: Devices ensuring safe remote access, especially crucial during DR situations.
On-site Physical Setup:
- Organized Housing: Structures like racks to neatly store and manage hardware.
- Emergency Temperature Control: Backup cooling mechanisms to counter server overheating during HVAC malfunctions.
Alternate Communication Channels:
- Orbit-based Phones: Handy when regular communication methods falter.
- Direct Communication Devices: Devices like radios, useful when primary systems are down.
Protection Mechanisms:
- Electronic Barriers & Alert Systems: Devices like firewalls and intrusion detection keep DR systems safeguarded.
- Physical Entry Control: Systems controlling entry and monitoring, ensuring only cleared personnel have access.
Uniformity and Compatibility in Hardware:
- It's simpler to manage and replace equipment in emergencies if hardware configurations are consistent and compatible.
Equipment for Trials and Upkeep:
- DR drills might use specific equipment to ensure the primary systems remain unaffected. This verifies the equipment's readiness and capacity to manage real crises.
In summary, while software and human interventions are important in disaster recovery operations, it is the hardware that provides the underlying support. It is critical for efficient disaster recovery plans to keep this hardware resilient, duplicated, and routinely assessed.
What is a RAID?
RAID is an acronym that stands for "Redundant Array of Independent Disks." It is a technique that combines numerous hard drives into a single device known as an array in order to improve performance, expand storage capacity, and/or offer redundancy to prevent data loss. RAID levels (for example, RAID 0, RAID 1, and RAID 5) provide varied benefits in terms of performance, redundancy, and storage efficiency.
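To make the trade-off between capacity and redundancy concrete, here is a rough sketch that computes usable capacity for the three levels mentioned above. It assumes all disks are the same size and ignores filesystem and controller overhead; `usable_capacity_tb` is a hypothetical helper, not part of any RAID tool.

```python
def usable_capacity_tb(level, disks, disk_size_tb):
    """Approximate usable capacity for RAID 0, 1 and 5 with equally sized disks."""
    if level == 0:
        # Striping only: all space is usable, but there is no redundancy.
        if disks < 2:
            raise ValueError("RAID 0 needs at least 2 disks")
        return disks * disk_size_tb
    if level == 1:
        # Mirroring: every disk holds the same data, so capacity equals one disk.
        if disks < 2:
            raise ValueError("RAID 1 needs at least 2 disks")
        return disk_size_tb
    if level == 5:
        # Striping with distributed parity: one disk's worth of space goes to parity.
        if disks < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        return (disks - 1) * disk_size_tb
    raise ValueError(f"unsupported RAID level: {level}")
```

For example, four 2 TB disks would give roughly 8 TB in RAID 0, 6 TB in RAID 5, and 2 TB in a four-way RAID 1 mirror.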
What is a microcontroller?
A microcontroller is a small integrated circuit that controls certain tasks in an embedded system. It typically includes a CPU, memory, and input/output peripherals.
What is a Network Interface Controller or NIC?
A Network Interface Controller (NIC) is a piece of hardware that connects a computer to a network and allows it to communicate with other devices.
What is a DMA?
Direct memory access (DMA) is a feature of computer systems that allows certain hardware subsystems to access main system memory independently of the central processing unit (CPU). DMA enables devices to send data to and receive data from main memory while the CPU remains free to perform other tasks.
What is a Real-Time Operating System?
A real-time operating system (RTOS) is an operating system (OS) for real-time computing applications that processes data and events that have critically defined time constraints. An RTOS is distinct from a time-sharing operating system, such as Unix, which manages the sharing of system resources with a scheduler, data buffers, or fixed task prioritization in a multitasking or multiprogramming environment. Processing time requirements need to be fully understood and bound rather than just kept as a minimum. All processing must occur within the defined constraints. Real-time operating systems are event-driven and preemptive, meaning the OS can monitor the relevant priority of competing tasks, and make changes to the task priority. Event-driven systems switch between tasks based on their priorities, while time-sharing systems switch the task based on clock interrupts.
List of interrupt types
There are six classes of interrupts possible:
- External
- Machine check
- I/O
- Program
- Restart
- Supervisor call (SVC)
Big Data
Explain what exactly Big Data is
As defined by Doug Laney:
- Volume: Extremely large volumes of data
- Velocity: Real time, batch, streams of data
- Variety: Various forms of data, structured, semi-structured and unstructured
- Veracity or Variability: Inconsistent, sometimes inaccurate, varying data
What is DataOps? How is it related to DevOps?
DataOps seeks to reduce the end-to-end cycle time of data analytics, from the origin of ideas to the literal creation of charts, graphs and models that create value. DataOps combines Agile development, DevOps and statistical process controls and applies them to data analytics.
What is data architecture?
An answer from talend.com:
"Data architecture is the process of standardizing how organizations collect, store, transform, distribute, and use data. The goal is to deliver relevant data to people who need it, when they need it, and help them make sense of it."
Explain the different formats of data
- Structured - data that has a defined format and length (e.g. numbers, words)
- Semi-structured - doesn't conform to a specific format but is self-describing (e.g. XML, SWIFT)
- Unstructured - does not follow a specific format (e.g. images, text messages)
What is a Data Warehouse?
- Wikipedia's explanation on Data Warehouse
- Amazon's explanation on Data Warehouse
What is a Data Lake?
Data Lake - Wikipedia
Can you explain the difference between a data lake and a data warehouse?
What is "Data Versioning"? What models of "Data Versioning" are there?
What is ETL?
Apache Hadoop
Explain what Hadoop is
Apache Hadoop - Wikipedia
Explain Hadoop YARN
Responsible for managing the compute resources in clusters and scheduling users' applications
Explain Hadoop MapReduce
A programming model for large-scale data processing
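The model itself can be illustrated without Hadoop. The sketch below is a single-process word count that mimics the map, shuffle, and reduce phases; it does not use any Hadoop API, and the function names and sample documents are made up for illustration.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


documents = ["big data big clusters", "data moves to clusters"]

mapped = [pair for document in documents for pair in map_phase(document)]
counts = dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())
```

In a real cluster, the map and reduce functions run in parallel on many machines and the framework handles the shuffle; here everything runs sequentially to show only the data flow.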
Explain Hadoop Distributed File Systems (HDFS)
- Distributed file system providing high aggregate bandwidth across the cluster.
- For a user it looks like a regular file system structure, but behind the scenes it's distributed across multiple machines in a cluster
- Typical file size is in the terabyte range, and it can scale to support millions of files
- It's fault tolerant which means it provides automatic recovery from faults
- It's best suited for running long batch operations rather than live analysis
What do you know about HDFS architecture?
HDFS Architecture
- Master-slave architecture
- Namenode - master, Datanodes - slaves
- Files split into blocks
- Blocks stored on datanodes
- Namenode controls all metadata
Ceph
Explain what Ceph is
Ceph is an open-source distributed storage system designed to provide excellent performance, reliability, and scalability. It's often used in cloud computing environments and data centers.
True or False? Ceph favors consistency and correctness over performance
True
Which services or types of storage does Ceph support?
- Object (RGW)
- Block (RBD)
- File (CephFS)
What is RADOS?
- Reliable Autonomic Distributed Object Storage
- Provides low-level data object storage service
- Strong Consistency
- Simplifies design and implementation of higher layers (block, file, object)
Describe RADOS software components
- Monitor
  - Central authority for authentication, data placement, policy
  - Coordination point for all other cluster components
  - Protects critical cluster state with Paxos
- Manager
  - Aggregates real-time metrics (throughput, disk usage, etc.)
  - Host for pluggable management functions
  - 1 active, 1+ standby per cluster
- OSD (Object Storage Daemon)
  - Stores data on an HDD or SSD
  - Services client IO requests
What is the workflow of retrieving data from Ceph?
The workflow is as follows:
- The client sends a request to the Ceph cluster to retrieve data. The client could be any of the following:
  - Ceph Block Device
  - Ceph Object Gateway
  - Any third-party Ceph client
- The client retrieves the latest cluster map from the Ceph Monitor
- The client uses the CRUSH algorithm to map the object to a placement group. The placement group is then assigned to an OSD.
- Once the placement group and the OSD Daemon are determined, the client can retrieve the data from the appropriate OSD
What is the workflow of writing data to Ceph?
The workflow is as follows:
- The client sends a request to the Ceph cluster to write data
- The client retrieves the latest cluster map from the Ceph Monitor
- The client uses the CRUSH algorithm to map the object to a placement group. The placement group is then assigned to a Ceph OSD Daemon dynamically.
- The client sends the data to the primary OSD of the determined placement group. If the data is stored in an erasure-coded pool, the primary OSD is responsible for encoding the object into data chunks and coding chunks, and distributing them to the other OSDs.
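The key idea in both workflows is that the client computes where an object lives instead of asking a central lookup service. The toy sketch below illustrates that two-step mapping (object to placement group, placement group to OSDs). It is NOT the real CRUSH algorithm: the hashing scheme, the ranking trick, and all names (`object_to_pg`, `pg_to_osds`, `locate`) are assumptions chosen for illustration, and it ignores the cluster map, weights, and failure domains.

```python
import hashlib


def _stable_hash(value):
    """Deterministic hash so every client computes the same placement."""
    return int(hashlib.sha256(value.encode()).hexdigest(), 16)


def object_to_pg(object_name, pg_num):
    """Step 1: map an object name to a placement group id."""
    return _stable_hash(object_name) % pg_num


def pg_to_osds(pg_id, osds, replicas=3):
    """Step 2 (stand-in for CRUSH): rank OSDs per placement group and take the top ones.

    The first OSD in the returned list plays the role of the primary.
    """
    ranked = sorted(osds, key=lambda osd: _stable_hash(f"{pg_id}:{osd}"), reverse=True)
    return ranked[:replicas]


def locate(object_name, pg_num, osds, replicas=3):
    """Return the placement group id and the OSDs responsible for an object."""
    pg_id = object_to_pg(object_name, pg_num)
    return pg_id, pg_to_osds(pg_id, osds, replicas)


# Hypothetical cluster with six OSDs and 128 placement groups.
cluster_osds = [f"osd.{n}" for n in range(6)]
pg_id, acting_set = locate("my-object", 128, cluster_osds)
```

Because the mapping is a pure function of the object name and the cluster layout, any client holding the same layout information arrives at the same OSDs without a lookup table.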
What are "Placement Groups"?
Describe in the detail the following: Objects -> Pool -> Placement Groups -> OSDs
What is OMAP?
What is a metadata server? How does it work?
Packer
What is Packer? What is it used for?
In general, Packer automates the creation of machine images. It allows you to focus on configuration prior to deployment, while building the images. This allows you to start the instances much faster in most cases.
Packer follows a "configuration->deployment" model or "deployment->configuration"?
A configuration->deployment which has some advantages like:
- Deployment speed: you configure once prior to deployment instead of configuring every time you deploy. This allows you to start instances/services much quicker.
- More immutable infrastructure: with configuration->deployment, deployments are unlikely to differ much, since most of the configuration is done prior to deployment. Issues like dependency errors are handled/discovered prior to deployment in this model.
Release
Explain Semantic Versioning
This page explains it perfectly:
Given a version number MAJOR.MINOR.PATCH, increment the:
MAJOR version when you make incompatible API changes
MINOR version when you add functionality in a backwards compatible manner
PATCH version when you make backwards compatible bug fixes
Additional labels for pre-release and build metadata are available as extensions to the MAJOR.MINOR.PATCH format.
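The increment rules above can be sketched as a small helper. This is a minimal illustration that handles only plain MAJOR.MINOR.PATCH strings (no pre-release or build metadata labels); `bump` is a hypothetical function, not part of any official tool.

```python
def bump(version, change):
    """Return the next version for a 'major', 'minor' or 'patch' change."""
    major, minor, patch = (int(part) for part in version.split("."))
    if change == "major":
        # Incompatible API change: lower components reset to zero.
        return f"{major + 1}.0.0"
    if change == "minor":
        # Backwards compatible functionality: patch resets to zero.
        return f"{major}.{minor + 1}.0"
    if change == "patch":
        # Backwards compatible bug fix.
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown change type: {change}")
```

For instance, a bug fix on 1.4.2 gives 1.4.3, a new backwards compatible feature gives 1.5.0, and a breaking change gives 2.0.0.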
Certificates
If you are looking for a way to prepare for a certain exam, this is the section for you. Here you'll find a list of certificates, each referencing a separate file with focused questions that will help you prepare for the exam. Good luck :)
AWS
- Cloud Practitioner (Latest update: 2020)
- Solutions Architect Associate (Latest update: 2021)
- Cloud SysOps Administration Associate (Latest update: Oct 2022)
Azure
- AZ-900 (Latest update: 2021)
Kubernetes
- Certified Kubernetes Administrator (CKA) (Latest update: 2022)
Additional DevOps and SRE Projects
Credits
Thanks to all of our amazing contributors who make it easy for everyone to learn new things :)
Logos credits can be found here
License