Japin

To climb high, start from the bottom; to travel far, start from nearby.

Robert Haas recently committed a feature that lets pg_basebackup specify where a backup is stored. The details are as follows:

commit 3500ccc39b0dadd1068a03938e4b8ff562587ccc
Author: Robert Haas <rhaas@postgresql.org>
Date:   Tue Nov 16 15:20:50 2021 -0500

    Support base backup targets.

    pg_basebackup now has a --target=TARGET[:DETAIL] option. If specfied,
    it is sent to the server as the value of the TARGET option to the
    BASE_BACKUP command. If DETAIL is included, it is sent as the value of
    the new TARGET_DETAIL option to the BASE_BACKUP command.  If the
    target is anything other than 'client', pg_basebackup assumes that it
    will now be the server's job to write the backup in a location somehow
    defined by the target, and that it therefore needs to write nothing
    locally. However, the server will still send messages to the client
    for progress reporting purposes.

    On the server side, we now support two additional types of backup
    targets.  There is a 'blackhole' target, which just throws away the
    backup data without doing anything at all with it. Naturally, this
    should only be used for testing and debugging purposes, since you will
    not actually have a backup when it finishes running. More usefully,
    there is also a 'server' target, so you can now use something like
    'pg_basebackup -Xnone -t server:/SOME/PATH' to write a backup to some
    location on the server. We can extend this to more types of targets
    in the future, and might even want to create an extensibility
    mechanism for adding new target types.

    Since WAL fetching is handled with separate client-side logic, it's
    not part of this mechanism; thus, backups with non-default targets
    must use -Xnone or -Xfetch.

    Patch by me, with a bug fix by Jeevan Ladhe.  The patch set of which
    this is a part has also had review and/or testing from Tushar Ahuja,
    Suraj Kharage, Dipesh Pandit, and Mark Dilger.

    Discussion: http://postgr.es/m/CA+TgmoaYZbz0=Yk797aOJwkGJC-LK3iXn+wzzMx7KdwNpZhS5g@mail.gmail.com
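A minimal sketch of how the new option might be used from the command line (host, user, and paths below are hypothetical; both the server and pg_basebackup must be version 15 or later):

    # Write the backup on the server side instead of locally;
    # WAL must be excluded or fetched separately (-X none or -X fetch).
    pg_basebackup -h primary.example.com -U backup_user \
        -X none --target=server:/var/lib/postgresql/backups/base1

    # Discard the data on the server; only useful for testing.
    pg_basebackup -h primary.example.com -U backup_user \
        -X none --target=blackhole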
Read more »

PostgreSQL 15 adds a jsonlog log destination, which writes the server log in JSON format. The detailed commit message is below.

commit dc686681e0799b12c40f44f85fc5bfd7fed4e57f
Author: Michael Paquier <michael@paquier.xyz>
Date:   Mon Jan 17 10:16:53 2022 +0900

    Introduce log_destination=jsonlog

    "jsonlog" is a new value that can be added to log_destination to provide
    logs in the JSON format, with its output written to a file, making it
    the third type of destination of this kind, after "stderr" and
    "csvlog".  The format is convenient to feed logs to other applications.
    There is also a plugin external to core that provided this feature using
    the hook in elog.c, but this had to overwrite the output of "stderr" to
    work, so being able to do both at the same time was not possible.  The
    files generated by this log format are suffixed with ".json", and use
    the same rotation policies as the other two formats depending on the
    backend configuration.

    This takes advantage of the refactoring work done previously in ac7c807,
    bed6ed3, 8b76f89 and 2d77d83 for the backend parts, and 72b76f7 for the
    TAP tests, making the addition of any new file-based format rather
    straight-forward.

    The documentation is updated to list all the keys and the values that
    can exist in this new format.  pg_current_logfile() also required a
    refresh for the new option.

    Author: Sehrope Sarkuni, Michael Paquier
    Reviewed-by: Nathan Bossart, Justin Pryzby
    Discussion: https://postgr.es/m/CAH7T-aqswBM6JWe4pDehi1uOiufqe06DJWaU5=X7dDLyqUExHg@mail.gmail.com
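A minimal sketch of enabling the new destination from SQL (PostgreSQL 15 or later, with the logging collector already set up):

    -- Add jsonlog alongside stderr; the log files are suffixed with ".json".
    ALTER SYSTEM SET log_destination = 'stderr,jsonlog';
    SELECT pg_reload_conf();

    -- Locate the current JSON log file.
    SELECT pg_current_logfile('jsonlog');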
Read more »

While browsing the mailing lists recently, I noticed that a rule containing a whole-row (composite) reference, once exported with pg_dump, could not be restored into a database. This has been confirmed as a bug and has since been fixed.

commit 43c2175121c829c8591fc5117b725f1f22bfb670
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date:   Thu Jan 13 17:49:25 2022 -0500

    Fix ruleutils.c's dumping of whole-row Vars in more contexts.

    Commit 7745bc352 intended to ensure that whole-row Vars would be
    printed with "::type" decoration in all contexts where plain
    "var.*" notation would result in star-expansion, notably in
    ROW() and VALUES() constructs.  However, it missed the case of
    INSERT with a single-row VALUES, as reported by Timur Khanjanov.

    Nosing around ruleutils.c, I found a second oversight: the
    code for RowCompareExpr generates ROW() notation without benefit
    of an actual RowExpr, and naturally it wasn't in sync :-(.
    (The code for FieldStore also does this, but we don't expect that
    to generate strictly parsable SQL anyway, so I left it alone.)

    Back-patch to all supported branches.

    Discussion: https://postgr.es/m/efaba6f9-4190-56be-8ff2-7a1674f9194f@intrans.baku.az
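A hedged sketch of the kind of rule that was affected (the table and rule names are made up): the rule action stores a whole-row reference through a single-row VALUES, and before the fix pg_dump emitted it without the "::type" cast, so the star-expanded form could not be restored.

    CREATE TABLE t (a int, b text);
    CREATE TABLE t_log (old_row t);  -- column of t's composite row type

    -- ON UPDATE, also record the whole OLD row via a single-row VALUES.
    CREATE RULE t_log_update AS ON UPDATE TO t
        DO ALSO INSERT INTO t_log VALUES (old);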
Read more »

Understanding PostgreSQL locks is important for building scalable applications and avoiding downtime. Modern computers and servers have many CPU cores and can execute multiple queries in parallel. When queries or background processes running in parallel modify the database's many shared structures, they can crash the database or even corrupt data. We therefore need a way to block concurrent processes while a shared memory structure or row is being changed: either one thread updates the structure while all others wait (an exclusive lock), or multiple threads read the structure while all writers wait. The side effect of this waiting is lock contention and wasted server resources, so it is important to understand why waits happen and which locks are involved. In this article, I review PostgreSQL row-level locking.
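As a minimal illustration of a row-level lock (the table and values are hypothetical), two sessions contending on the same row behave like this:

    -- Session 1: lock one row and keep the transaction open.
    BEGIN;
    SELECT * FROM accounts WHERE id = 1 FOR UPDATE;

    -- Session 2: blocks here until session 1 commits or rolls back.
    UPDATE accounts SET balance = balance - 100 WHERE id = 1;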

In follow-up articles, I will look at table-level locks and the latches that protect internal database structures.

Read more »

The following information focuses on operating systems that use the GNU C Library (glibc), which includes the most popular Linux distributions. All versions of PostgreSQL are affected. Other operating systems could in principle have the same problem, but we have not gathered any information about them.

PostgreSQL sorts text using the locale data provided by the operating system's C library. Sorting happens in a variety of contexts, including user output, merge joins, B-tree indexes, and range partitioning. In the latter two cases, the sorted data is persisted to disk. If the C library's locale changes during the lifetime of a database, the persisted data can become inconsistent with the expected sort order, leading to erroneous query results and other incorrect behavior. For example, if an index is no longer sorted the way an index scan expects, a query may fail to find data that is actually there, and an update may insert duplicates that should not be allowed. Likewise, with a partitioned table, a query may look in the wrong partition and an update may write to the wrong partition. It is therefore essential for correct operation that the locale does not change incompatibly during the lifetime of a database.

Operating system vendors, and in particular the authors of the GNU C Library, change locales from time to time in minor ways to correct mistakes or add support for more languages. While this technically violates the rule above, it has historically affected few users and received little attention. However, glibc version 2.28, released on 2018-08-01, includes a major update of the locale data that may affect many users' data. Note that the update itself is legitimate, since it brings the locales in line with current international standards; but applying it to an existing PostgreSQL system is bound to cause problems.

Operating system vendors are responsible for integrating glibc updates into Linux distributions. We expect that vendors of long-term-support distributions will not apply incompatible locale updates within a given release, but this is only an expectation, since we cannot predict or influence their future actions. Moreover, PostgreSQL currently has no way to detect an incompatible glibc update. Some manual effort is therefore required when planning any update or upgrade.
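As a rough sketch of that manual effort (the index and database names are hypothetical), B-tree indexes on text columns can be checked with amcheck and rebuilt after such a glibc upgrade:

    -- Look for ordering problems in an index on a text column.
    CREATE EXTENSION IF NOT EXISTS amcheck;
    SELECT bt_index_check('orders_customer_name_idx');

    -- Rebuild indexes whose sort order may have changed.
    REINDEX INDEX orders_customer_name_idx;
    -- Or, more coarsely (mydb must be the database you are connected to):
    REINDEX DATABASE mydb;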

Read more »

Recently, the application developers reported a problem with the following error message:

ERROR:  could not read block 0 in file "base/16385/2294016": read only 0 of 8192 bytes
CONTEXT: SQL function "obj_description" during startup

This error looks very similar to the situation described in my earlier article on PostgreSQL HASH indexes (《PostgreSQL HASH 索引拾遗》). However, the database here is version 10.4 and runs as a single node, so HASH index problems can be ruled out.

Read more »

In PostgreSQL, a query generally goes through the following stages:

  1. The application connects to PostgreSQL, sends a query, and waits for the results.
  2. The parser checks the query transmitted by the application for correct syntax and creates a query tree.
  3. The rewrite system takes the query tree created by the parser stage and looks up rules stored in the system catalogs to rewrite the query tree, for example for views.
  4. The planner/optimizer takes the (rewritten) query tree and creates a query plan that will be the input to the executor.
    It first creates all paths that lead to the same result. For example, if the table has an index, two paths are created: a sequential scan and an index scan. It then estimates the execution cost of each path, picks the cheapest one, and expands it into a complete plan the executor can use (a small EXPLAIN sketch follows this list).
  5. The executor recursively steps through the plan tree and retrieves rows in the way the plan represents. While scanning tables, the executor uses the storage system, performs sorts and joins, evaluates qualifications, and finally hands back the rows it derived.
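A small sketch of watching the planner's choice in step 4 (the table and data are hypothetical):

    CREATE TABLE items (id int PRIMARY KEY, name text);
    INSERT INTO items SELECT g, 'item ' || g FROM generate_series(1, 100000) AS g;
    ANALYZE items;

    -- Selective predicate: the index scan path is estimated to be cheaper.
    EXPLAIN SELECT * FROM items WHERE id = 42;

    -- Whole-table read: the sequential scan path wins instead.
    EXPLAIN SELECT * FROM items;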
Read more »

pg_cron is a job scheduler extension for PostgreSQL that lets you run specific tasks on a schedule. It follows the Linux crontab configuration syntax, and you can validate your cron expressions with the online tool crontab.guru. Currently, pg_cron only supports PostgreSQL 10 and later. This article briefly covers installing and using pg_cron.
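A minimal sketch of scheduling a job (the job below is made up; pg_cron must already be listed in shared_preload_libraries and the extension created in the target database):

    CREATE EXTENSION pg_cron;

    -- Run VACUUM every night at 03:00; cron.schedule returns the job id.
    SELECT cron.schedule('0 3 * * *', 'VACUUM');

    -- Inspect and remove jobs.
    SELECT jobid, schedule, command FROM cron.job;
    SELECT cron.unschedule(1);  -- 1 is a hypothetical job id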

Read more »