[Dr.Lib]Note:Algorithm – 莫队算法

莫队算法:对于两个区间的查询[l1,r1] ,[l2,r2]如果每增加一个区间元素或者删除,都能做到O(1)的话
那么从[l1,r1]转移到[l2,r2],暴力可以做到|l1-l2|+|r1-r2|,就是manhattan距离
莫队的论文一直没有找到,所以只是大致的了解下,应该是证明出构造出哈密尔顿路径是最优的。
但是可以用manhattan mst来构造,大约会是两倍,然后莫队证明出这样转移的上限是O(n*sqrt(n))。
所以对于这种无修改的区间查询来说
可以先将所有的区间,看成二维平面上的点,求一次manhattan mst,然后根据mst来进行转移
相邻的两个区间的查询转移,暴力解决。

Via http://blog.csdn.net/huzecong/article/details/8576908

http://blog.csdn.net/acm_cxlove/article/details/8894431

在此感谢

据说这个算法是莫涛提出的(Orz!),但是在网上到处都搜不到相关资料,最后问pty才知道的。这个算法是用于处理一类不带修改的区间查询问题的离线算法,其核心在于利用曼哈顿距离最小生成树算法对区间处理顺序进行处理。比如下面这个例题(清橙A1206《小Z的袜子》,就是莫队出的题):

给定一个长为N的序列,每个元素的值是其颜色。有M次询问,每次询问从一个区间中随机选取两个元素同色的概率。

一次询问[l,r]的答案即,其中是区间中第i中颜色的个数。显然暴力是O(NM)的,而且一般的区间问题的思路似乎不适用。

我们先考虑一个简化的问题:所有的查询区间的左端点都是1。那么我们可以按右端点排序,假设已经处理出了[1,r]的答案,考虑转移到[1,r+k],即添加k个元素,这个可以在O(k)的复杂度内求出。那么处理所有区间的复杂度(不考虑排序)就是O(N)。

那么如果是从[l,r]转移到[l’,r’]呢?复杂度即O(|r’-r|+|l’-l|),也即点(l,r)到点(l’,r’)的曼哈顿距离。那么如果将所有询问转化成二维平面中的点,求曼哈顿距离最小生成树,再按照生成树的顺序做,就可以最小化区间之间转移的复杂度。可以证明(我不会证……似乎莫队的论文里有),这样做的复杂度是O(N1.5)的。问题也就得到了解决。

[Dr.Lib]Note:Algorithm – Kruskal

 

[Dr.Lib]Note:Math – Extend Baby Step Giant Step

这种数学题都是转AC大神的……以及Seter

http://hi.baidu.com/aekdycoin/item/236937318413c680c2cf29d4

http://seter.is-programmer.com/posts/32188.html

Baby Step Giant Step

\[A^{x}=B (\mod C) 0\leq x <C,C\in \mathbb{P}\]

求解x,我们可以做一个等价

\[x = i\ast m + j  ( 0 \leq i,j< m) ,m = \lceil \sqrt{C} \rceil\]

而这么分解的目的无非是为了转化为:

\[(A^{i})^{m} * A^{j} = B \mod C\]

之后做少许暴力的工作就可以解决问题:
(1) for i = 0 -> m, 插入\( Hash (A^i \mod C, i) \)

(2) 枚举 i ,对于每一个枚举到的i,令  \( {A}' = (A^m)^i \mod C\)
我们有\[A' * A^j = B \mod C\]

显然\(A'B,C\),均已知,而由于C为素数,那么(AA,C)无条件为1

令\(p=(A^M)^i \mod C,x=A^j \mod C\)则\(p*x=B\mod C\)。

我们从0开始枚举i直到i*M超过C-1,则相当于知道了p,要求x。

由于 (A',C)=1 ,于是对于这个模方程解的个数唯一(可以利用扩展欧几里得或 欧拉定理来求解)

那么对于得到的唯一解x,在Hash表中寻找(x,j),如果找到,则返回  \(i*m+j>0\) 。

如果需要得到 x > 0的解,那么只需要在上面的步骤中判断 当 \(i*m+j>0\) 的时候才返回。

扩展Baby Step Giant Step

\[A^{x}=B (\mod C) 0\leq x <C,C\in \mathbb{N^*}\]

当C不是素数的时候能否直接套用呢?当然不可以……最直接的问题就是,不一定存在逆元!

考虑一个\(G'\)同时是ABC的因数,令\(B'=\frac{B}{G'},C'=\frac{C}{G'}\),则当x不等于0时,

\(\frac{A}{G'}*A^(x-1)=B'\mod  C'\)。

这样多弄几次,C的因数就越来越少,直到(A,C)=1。

那么如何选取\(G'\)呢?AC大神告诉你:不断取\(G'=gcd(A,C')\),直到\(G'\)=1。如果任意一个\(G'\)不是\(B'\)的因数则一定无解。

假设取了r次\(G'\),然后所有\(G'\)的积是G,则问题变为\(\frac{A^r}{G}*A^{x-r}=B'\mod C'\),令\(D=\frac{A^r}{G}\)(一定是整数),则由于此时\((A,C')=1\),所以\((D,C')=1\)。

那么上面算法中只要变成求\(D\)对于\(C'\)的逆元就可以了,返回的答案还要加上r。

还有一点小问题,就是这样得出的答案是大于等于r的。但是即使每次\(G'=2\),r最大也只有\(log_{2}C\),那么这些再暴力求解就可以了。

[Dr.Lib]Note:Algorithm – Manhattan minimum spanning tree

Via http://community.topcoder.com/tc?module=Static&d1=tutorials&d2=lineSweep

We first break this down into a simpler problem. Standard MST algorithms for general graphs (e.g., Prim's algorithm) can compute the MST in O((E + N) log N) time for E edges. If we can exploit geometric properties to reduce the number of edges to O(N), then this is merely O(N log N). In fact we can consider, for each point P, only its nearest neighbors in each of the 8 octants of the plane (see the figure below). The figure shows the situation in just one of the octants, the West-Northwest one. Q is the closest neighbour (with the dashed line indicating points at the same Manhattan distance as Q), and R is some other point in the octant. If PR is an edge in a spanning tree, then it can be removed and replaced by either PQ or QR to produce a better spanning tree, because the shape of the octant guarantees that |QR| ≤ |PR|. Thus, we do not need to consider PR when building the spanning tree.

Octants

This reduces the problem to that of finding the nearest neighbour in each octant. We'll just consider the octant shown; the others are no different and can be handled by symmetry. It should be clear that within this octant, finding the nearest neighbour is equivalent to just finding the point with the largest value of x − y, subject to an upper bound on x + y and a lower bound on y, and this is the form in which we'll consider the problem.

Now imagine for the moment that the lower bound on y did not exist. In this case we could solve the problem for every P quite easily: sweep through the points in increasing order of x + y, and Q will be the point with the largest x − y value of those seen so far. This is where the divide-and-conquer principle comes into play: we partition the point set into two halves with a horizontal line, and recursively solve the problem for each half. For points P in the upper half, nothing further needs to be done, because points in the bottom half cannot play Q to their P. For the bottom half, we have to consider that by ignoring the upper half so far we may have missed some closer points. However, we can take these points into account in a similar manner as before: walk through all the points in x + y order, keeping track of the best point in the top half (largest x − y value), and for each point in the bottom half, checking whether this best top-half point is better than the current neighbour.

So far I have blithely assumed that any set of points can be efficiently partitioned on Y and also walked in x + y order without saying how this should be done. In fact, one of the most beautiful aspects of this class of divide-and-conquer plus line-sweep algorithms is that it has essentially the same structure as a merge sort, to the point that a merge-sort by x + y can be folded into the algorithm in such a way that each subset is sorted on x + y just when this is needed (the points initially all being sorted on Y). This gives the algorithm a running time of O(N log N).

The idea of finding the closest point within an angle range can also be used to solve the Euclidean MST problem, but the O(N log N) running time is no longer guaranteed in the worst cases, because the distance is no longer a linear equation. It is actually possible to compute the Euclidean MST in O(N log N) time, because it is a subset of the Delaunay triangulation.

[Dr.Lib]Note:Data Structures – Bottom-up SPLAY

Let us play!

Let u splay!

 

BST 平均 O(logn)

Treap 期望 O(logn)

Splay 均摊 O(logn)

AVL 严格 O(logn)

SBT 严格 O(logn)

AVL不大指望能在赛场写出来,SBT在绝大多数时候是个不错的选择,Treap…嗯……但偏偏Splay被人看成是BST中的万金油……为什么呐?

Splay一个最大的特点就是:每一个操作都要splay一下……除此之外没了……

Splay

splay操作的作用就是把某一节点旋转到根,以加快之后的访问速度。它还可以在旋转过程中维持树的基本平衡。

  1. zig: 当目标节点是根节点的 左子节点 或 右子节点 时,进行一次zig,将目标节点调整到根节点的位置。
    709px-Splay_tree_zig.svg
  2. zig-zig:当目标节点是根节点的 左(右)子节点的左(右)子节点 时,进行一次zig-zig,将目标节点调整到根节点位置。
    Zigzig
  3. zig-zag:当目标节点是根节点的 左(右)子节点的右(左)子节点 时,进行一次zig-zag,将目标节点调整到根节点位置。
    Zigzag

 

重复这几种操作直到目标节点为根。

一次zig、zig-zig、zig-zag又可以分解为几次单旋。

Search/Insert/Rank/Select

与BST相同,只不过记得把插入的节点Splay到根。

Join

作用是把两棵子树S1,S2连成一个Splay,其中S1中所有元素小于S2。

先找到S1中的最大节点x,Splay到S1的根,再把x的右子树链接到S2。

Delete

先找到目标节点,Splay到根,执行Join操作即可。

代码