[Dr.Lib]Note:Bio – 为什么植物不吸收绿色光?

Via http://www.guokr.com/question/455259/

为什么植物不吸收绿色光?

绿色光应该是太阳光谱的中心,能量是最多的吧?
按理说,只要能光合作用,所有的光波都吸收,一点也不浪费。
为什么要浪费掉绿光呢?
为什么叶子不进化成黑色呢?

Ans:

Via 九维空间

根本上应该从叶绿素的分子结构考虑,在绿光波段(波长500-550纳米左右区域)吸收较弱。下图就是叶绿素分子的吸收图。可以看出从蓝光到黄光基本都被反射了。绿光是中心波长,所以富含叶绿素的植物组织呈现绿色。

叶绿素吸收光谱

叶绿素吸收光谱

 

从物理上来说,大部分有机分子在可见光波段的中间部分吸收都比较弱。因为:1紫光和紫外光的频率一般对应于分子里成键电子能级的跃迁,2红光和红外光一般对应于分子振动能级的跃迁。3而分子的转动能级跃迁对应于微波频率。以上就是分子所有的三种能级跃迁,只有少部分已知分子有对应于500-600纳米波长的可见光部分的电子能级跃迁。

也许根本就不存在在绿光波段有强吸收,并且能作为光合作用催化剂的分子,至少现在没找到。

PS:到是有很多原子的价电子能级跃迁对应于可见光,因为一个原子核形成的中心势阱比分子里面多个原子核形成的中心势阱要浅,所以电子能级间隔会小很多。

[Dr.Lib]Note:Math – 条件概率与遗传

条件概率

条件概率就是事件A在另外一个事件B已经发生条件下的发生概率。条件概率表示为P(A|B),读作“在B条件下A的概率”。
联合概率表示两个事件共同发生的概率。A与B的联合概率表示为或者。
边缘概率是某个事件发生的概率。边缘概率是这样得到的:在联合概率中,把最终结果中不需要的那些事件合并成其事件的全概率而消失(对离散随机变量用求和得全概率,对连续随机变量用积分得全概率)。这称为边缘化(marginalization)。A的边缘概率表示为P(A),B的边缘概率表示为P(B)。
需要注意的是,在这些定义中A与B之间不一定有因果或者时间顺序关系。A可能会先于B发生,也可能相反,也可能二者同时发生。A可能会导致B的发生,也可能相反,也可能二者之间根本就没有因果关系。
例如考虑一些可能是新的信息的概率条件性可以通过贝叶斯定理实现。

定义

在同一个样本空间Ω中的事件或者子集A与B,如果随机从Ω中选出的一个元素属于B,那么这个随机选择的元素还属于A的概率就定义为在B的前提下A的条件概率。从这个定义中,我们可以得出

P(A|B) = |A∩B|/|B|

分子、分母都除以|Ω|得到

\(P(A|B) = \frac{P(A \cap B)}{P(B)}\)

有时候也称为:后验概率

统计独立性

当且仅当两个随机事件A与B满足

\(P(A \cap B) \ = \ P(A) P(B)\)。

的时候,它们才是统计独立的,这样联合概率可以表示为各自概率的简单乘积。

同样,对于两个独立事件A与B有

\(P(A|B) \ = \ P(A)\)

以及

\(P(B|A) \ = \ P(B)\)。

换句话说,如果A与B是相互独立的,那么A在B这个前提下的条件概率就是A自身的概率;同样,B在A的前提下的条件概率就是B自身的概率。

互斥性

当且仅当A与B满足

\(P(A \cap B) = 0\)

\(P(A) \ne 0\),\(P(B) \ne 0\)

的时候,A与B是互斥的。

因此,

\(P(A\mid B) = 0\)
\(P(B\mid A) = 0\)。

换句话说,如果B已经发生,由于A不能B在同一场合下发生,那么A发生的概率为零;同样,如果A已经发生,那么B发生的概率为零。

其它

    • 如果事件B的概率\(P(B) > 0\),那么\(Q(A) = P(A|B)\)在所有事件A上所定义的函数Q就是概率测度
    • 如果\(P(B)=0\),\(P(A|B)\)没有定义。
    • 条件概率可以用决策树进行计算。

(原来我的那种方法叫决策树啊……)

形式定义

考虑概率空间Ω(S, σ(S)),其中σ(S)是集S上的σ代数,Ω上对应于随机变量X的概率测度(可以理解为概率分布)为PX;又A∈σ(S),PX(A)≥0(这里可以理解为事件A,A不是零测集)。则∀E∈σ(S),可以定义集函数PX|A如下:

PX|A(E)=PX(A∩E)/PX(E)。

易知PX|A也是Ω上的概率测度,此测度称为X在A下的条件测度(条件概率分布)。

独立性:设A,B∈σ(S),称A,B在概率测度P下为相互独立的,若P(A∩E)=P(A)P(E)。

贝叶斯定理

贝叶斯定理是关于随机事件A和B的条件概率(或边缘概率)的一则定理。

\(P(A|B) = \frac{P(B | A)\, P(A)}{P(B)}\)

其中P(A|B)是在B发生的情况下A发生的可能性。

在贝叶斯定理中,每个名词都有约定俗成的名称:

按这些术语,Bayes定理可表述为:

后验概率 = (相似度*先验概率)/标准化常量

也就是说,后验概率与先验概率和相似度的乘积成正比。

另外,比例P(B|A)/P(B)也有时被称作标准相似度(standardised likelihood),Bayes定理可表述为:

后验概率 = 标准相似度*先验概率

遗传问题的计算

Example1:

已知甲病为由隐性基因a控制的遗传病,一父本的基因为aa,母本表现型正常,基因型有1/3的可能性是AA,有2/3的可能性是Aa,问子代一对异卵双胞胎同时患甲病的概率是多少?

Ans:

\(\frac{1}{6}\)

\(P(C1)=P(Aa)*\frac{1}{2}=\frac{1}{3}\)

\(P(C2|C1)= \frac{1}{2} \)(此时母本必为Aa)

\(P(C2|C1)= \frac{ P(C1\bigcap C2)}{P(C1)}\)

\( P(C1\bigcap C2)=P(C2|C1) \cdot P(C1) =\frac{1}{6}\)

 

P(C1)和P(C2)分别为两个孩子患病的概率。P(Aa)为母本为Aa的概率。

Example2:

Aa控制的遗传病为常染色体隐形遗传病。男子的基因型有两种可能1/3AA,2/3Aa。与基因型为Aa的正常女子结婚。生了一个正常的儿子。求这个儿子是Aa的概率。

Ans:

\(\frac{3}{5}\)

\(P(C|A\_)=\frac{P(C \bigcap A\_)}{P(A\_)}=\frac{P(AA)  \cdot \frac{1}{2}+ P(Aa) \cdot \frac{1}{2}}{P(A\_)}=\frac{\frac{1}{3}  \cdot \frac{1}{2}+\frac{2}{3} \cdot \frac{1}{2}}{\frac{5}{6}}=\frac{3}{5}\)

 

P(A_)为孩子正常的概率。P(C)为孩子为Aa的概率。P(AA)和P(Aa)是男子基因型的概率。

[Dr.Lib]Note:Math – Probabilities 概率

Via http://community.topcoder.com/tc?module=Static&d1=tutorials&d2=probabilities

It has been said that life is a school of probability. A major effect of probability theory on everyday life is in risk assessment. Let's suppose you have an exam and you are not so well prepared. There are 20 possible subjects, but you only had time to prepare for 15. If two subjects are given, what chances do you have to be familiar with both? This is an example of a simple question inspired by the world in which we live today. Life is a very complex chain of events and almost everything can be imagined in terms of probabilities.

Gambling has become part of our lives and it is an area in which probability theory is obviously involved. Although gambling had existed since time immemorial, it was not until the seventeenth century that the mathematical foundations finally became established. It all started with a simple question directed to Blaise Pascal by Chevalier, a nobleman that gambled frequently to increase his wealth. The question was whether a double six could be obtained on twenty-four rolls of two dice.

As far as TopCoder problems are concerned, they're inspired by reality. You are presented with many situations, and you are explained the rules of many games. While it's easy to recognize a problem that deals with probability computations, the solution may not be obvious at all. This is partly because probabilities are often overlooked for not being a common theme in programming contests. But it is not true and TopCoder has plenty of them! Knowing how to approach such problems is a big advantage in TopCoder competitions and this article is to help you prepare for this topic.

Before applying the necessary algorithms to solve these problems, you first need some mathematical understanding. The next chapter presents the basic principles of probability. If you already have some experience in this area, you might want to skip this part and go to the following chapter: Step by Step Probability Computation. After that it follows a short discussion on Randomized Algorithms and in the end there is a list with the available problems on TopCoder. This last part is probably the most important. Practice is the key!

Basics
Working with probabilities is much like conducting an experiment. An outcome is the result of an experiment or other situation involving uncertainty. The set of all possible outcomes of a probability experiment is called a sample space. Each possible result of such a study is represented by one and only one point in the sample space, which is usually denoted by S. Let's consider the following experiments:

Rolling a die once
Sample space S = {1, 2, 3, 4, 5, 6}
Tossing two coins
Sample space S = {(Heads, Heads), (Heads, Tails), (Tails, Heads), (Tails, Tails)}

We define an event as any collection of outcomes of an experiment. Thus, an event is a subset of the sample space S. If we denote an event by E, we could say that E⊆S. If an event consists of a single outcome in the sample space, it is called a simple event. Events which consist of more than one outcome are called compound events.

What we are actually interested in is the probability of a certain event to occur, or P(E). By definition, P(E) is a real number between 0 and 1, where 0 denotes the impossible event and 1 denotes the certain event (or the whole sample space).

As stated earlier, each possible outcome is represented by exactly one point in the sample space. This leads us to the following formula:

That is, the probability of an event to occur is calculated by dividing the number of favorable outcomes (according to the event E) by the total number of outcomes (according to the sample space S). In order to represent the relationships among events, you can apply the known principles of set theory. Consider the experiment of rolling a die once. As we have seen previously, the sample space is S = {1, 2, 3, 4, 5, 6}. Let's now consider the following events:

Event A = 'score > 3' = {4, 5, 6}
Event B = 'score is odd' = {1, 3, 5}
Event C = 'score is 7' = ∅
A∪B ='the score is > 3 or odd or both' = {1, 3, 4, 5, 6}
A∩B ='the score is > 3 and odd' = {5}
A' = 'event A does not occur' = {1, 2, 3}

We have:

P(A∪B) = 5/6
P(A∩B) = 1/6
P(A') = 1 – P(A) = 1 – 1/2 = 1/2
P(C) = 0

The first step when trying to solve a probability problem is to be able to recognize the sample space. After that, you basically have to determine the number of favorable outcomes. This is the classical approach, but the way we implement it may vary from problem to problem. Let's take a look at QuizShow (SRM 223, Div 1 – Easy). The key to solving this problem is to take into account all the possibilities, which are not too many. After a short analysis, we determine the sample space to be the following:

S = { (wager 1 is wrong, wager 2 is wrong, you are wrong),
(wager 1 is wrong, wager 2 is wrong, you are right),
(wager 1 is wrong, wager 2 is right, you are wrong),
(wager 1 is wrong, wager 2 is right, you are right),
(wager 1 is right, wager 2 is wrong, you are wrong),
(wager 1 is right, wager 2 is wrong, you are right),
(wager 1 is right, wager 2 is right, you are wrong),
(wager 1 is right, wager 2 is right, you are right) }

The problem asks you to find a wager that maximizes the number of favorable outcomes. In order to compute the number of favorable outcomes for a certain wager, we need to determine how many points the three players end with for each of the 8 possible outcomes. The idea is illustrated in the following program:

Another good problem to start with is PipeCuts (SRM 233, Div 1 – Easy). This can be solved in a similar manner. There is a finite number of outcomes and all you need to do is to consider them one by one.

Let's now consider a series of n independent events: E1, E2, … , En. Two surprisingly common questions that may appear (and many of you have already encountered) are the following:

  1. What is the probability that all events will occur?
  2. What is the probability that at least one event will occur?

To answer the first question, we relate to the occurrence of the first event (call it E1). If E1 does not occur, the hypothesis can no longer be fulfilled. Thus, it must be inferred that E1 occurs with a probability of P(E1). This means there is a P(E1) chance we need to check for the occurrence of the next event (call it E2). The event E2 occurs with a probability of P(E2) and we can continue this process in the same manner. Because probability is by definition a real number between 0 and 1, we can synthesize the probability that all events will occur in the following formula:

The best way to answer the second question is to first determine the probability that no event will occur and then, take the complement. We have:

These formulae are very useful and you should try to understand them well before you move.

BirthdayOdds
A good example to illustrate the probability concepts discussed earlier is the classical "Birthday Paradox". It has been shown that if there are at least 23 people in a room, there is a more than 50% chance that at least two of them will share the same birthday. While this is not a paradox in the real sense of the word, it is a mathematical truth that contradicts common intuition. The TopCoder problem asks you to find the minimum number of people in order to be more than minOdds% sure that at least two of them have the same birthday. One of the first things to notice about this problem is that it is much easier to solve the complementary problem: "What is the probability that N randomly selected people have all different birthdays?". The strategy is to start with an empty room and put people in the room one by one, comparing their birthdays with those of them already in the room:

This so called "Birthday Paradox" has many real world applications and one of them is described in the TopCoder problem called Collision (SRM 153, Div 1 – Medium). The algorithm is practically the same, but one has to be careful about the events that may alter the sample space.

Sometimes a probability problem can be quite tricky. As we have seen before, the 'Birthday Paradox' tends to contradict our common sense. But the formulas prove to us that the answer is indeed correct. Formulas can help, but to become a master of probabilities you need one more ingredient: "number sense" . This is partly innate ability and partly learned ability acquired through practice. Take this quiz to assess your number sense and to also become familiar with some of the common probability misconceptions.

Step by Step Probability Computation
In this chapter we will discuss some real TopCoder problems in which the occurrence of an event is influenced by occurrences of previous events. We can think of it as a graph in which the nodes are events and the edges are dependencies between them. This is a somewhat forced analogy, but the way we compute the probabilities for different events is similar to the way we traverse the nodes of a graph. We start from the root, which is the initial state and has a probability of 1. Then, as we consider different scenarios, the probability is distributed accordingly.

NestedRandomness
This problem looked daunting to some people, but for those who figured it out, it was just a matter of a few lines. For the first step, it is clear what do we have to do: the function random(N) is called and it returns a random integer uniformly distributed in the range 0 to N-1. Thus, every integer in this interval has a probability of 1/N to occur. If we consider all these outcomes as input for the next step, we can determine all the outcomes of the random(random(N)) call. To understand this better, let's work out the case when N = 4.

  • After the first nesting all integers have the same probability to occur, which is 1 / 4.
  • For the second nesting there is a 1/4 chance for each of the following functions to be called: random(0), random(1), random(2) and random(3). Random(0) produces an error, random(1) returns 0, random (2) returns 0 or 1 (each with a probability of 1/2) and random(3) returns 0, 1 or 2.
  • As a result, for the third nesting, random(0) has a probability of 1/4 + 1/8 + 1/12 of being called, random(1) has a probability of 1/8 + 1/12 of being called and random(2) has a probability of 1/12 of being called.
  • Analogously, for the fourth nesting, the function random(0) has a probability of 1/4 of being called, while random(1) has a probability of 1/24.
  • As for the fifth nesting, we can only call random(0), which produces an error. The whole process is described in the picture to the right.

NestedRandomness for N = 4

The source code for this problem is given below:

If you got the taste for this problem, here are another five you may want to try:

ChessKnight – assign each square a probability and for every move check the squares one by one to compute the probabilities for the next move.
DiceThrows – determine the probability of each possible outcome for both players and then compare the results.
RockSkipping – the same approach, just make sure you got the lake pattern correctly.
PointSystem – represent the event space as a matrix of possible scores (x, y).
VolleyBall – similar to PointSystem, but the scores may go up pretty high.

Let's now take a look at another TopCoder problem, GeneticCrossover, which deals with conditional probability. Here, you are asked to predict the quality of an animal, based on the genes it inherits from its parents. Considering the problem description, there are two situations that may occur: a gene does not depend on another gene, or a gene is dependent.

For the first case, consider p the probability that the gene is to be expressed dominantly. There are only 4 cases to consider:

  • at least one parent has two dominant genes. (p = 1)
  • each parent has exactly one dominant gene. (p = 0.5)
  • one parent has one dominant gene and the other has only recessive genes (p = 0.25)
  • both parents have two recessive genes (p = 0)

Now let's take the case when a gene is dependent on another. This make things a bit trickier as the "parent" gene may also depend on another and so on … To determine the probability that a dependent gene is dominant, we take the events that each gene in the chain (starting with the current gene) is dominant. In order for the current gene to be expressed dominantly, we need all these events to occur. To do this, we take the product of probabilities for each individual event in the chain. The algorithm works recursively. Here is the complete source code for this problem:

See also ProbabilityTree.

Randomized Algorithms
We call randomized algorithms those algorithms that use random numbers to make decisions during their execution. Unlike deterministic algorithms that for a fixed input always give the same output and the same running-time, a randomized algorithm behaves differently from execution to execution. Basically, we distinguish two kind of randomized algorithms:

  1. Monte Carlo algorithms: may sometimes produce an incorrect solution – we bound the probability of failure.
  2. Las Vegas algorithms: always give the correct solution, the only variation is the running time – we study the distribution of the running time.

Read these lecture notes from the College of Engineering at UIUC for an example of how these algorithms work.

The main goal of randomized algorithms is to build faster, and perhaps simpler solutions. Being able to tackle "harder" problems is also a benefit of randomized algorithms. As a result, these algorithms have become a research topic of major interest and have already been utilized to more easily solve many different problems.

An interesting question is whether such an algorithm may become useful in TopCoder competitions. Some problems have many possible solutions, where a number of which are also optimal. The classical approach is to check them one by one, in an established order. But it cannot be guaranteed that the optima are uniformly distributed in the solution domain. Thus, a deterministic algorithm may not find you an optimum quickly enough. The advantage of a randomized algorithm is that there are actually no rules to set about the order in which the solutions are checked and for the cases when the optima are clustered together, it usually performs much better. See QueenInterference for a TopCoder example.

Randomized algorithms are particularly useful when faced with malicious attackers who deliberately try to feed a bad input to the algorithm. Such algorithms are widely used in cryptography, but it sometimes makes sense to also use them in TopCoder competitions. It may happen that you have an efficient algorithm, but there are a few degenerate cases for which its running time is significantly slower. Assuming the algorithm is correct, it has to run fast enough for all inputs. Otherwise, all the points you earned for submitting that particular problem are lost. This is why here, on TopCoder, we are interested in worst case execution time.

To challenge or not to challenge?
Another fierce coding contest is now over and you have 15 minutes to look for other coders' bugs. The random call in a competitor's submission is likely to draw your attention. This will most likely fall into one of two scenarios:

  1. the submission was just a desperate attempt and will most likely fail on many inputs.
  2. the algorithm was tested rather thoroughly and the probability to fail (or time out) is virtually null.

The first thing you have to do is to ensure it was not already unsuccessfully challenged (check the coder's history). If it wasn't, it may deserve a closer look. Otherwise, you should ensure that you understand what's going on before even considering a challenge. Also take into account other factors such as coder rating, coder submission accuracy, submission time, number of resubmissions or impact on your ranking.

Will "random" really work?
In most optimizing problems, the ratio between the number of optimal solutions and the total number of solutions is not so obvious. An easy, but not so clever solution, is to simply try generating different samples and see how the algorithm behaves. Running such a simulation is usually pretty quick and may also give you some extra clues in how to actually solve the problem.

[Dream YOLO]NOIP2013提高组酱油报告

370/600。
还不错的成绩。
但。。。怨念的是。。D1T2的10分。。
D1
T1
显然是快速幂。。
T2
显然当上下两序列同序(对应数字排名相同时和最小。
由于要求同序所以说序列一的顺序是无关紧要的,按序列一排序后我们只需再把序列二排序即可。由于交换一次至多消除一个逆序对,答案就是按序列一排序后的逆序对数。
至今不知哪写傻了,10分。
T3
当时写了一个裸并查集。。60。。
和YY聊了聊才发现应该先跑一个最大生成树的。。
D2
T1&T2都是喜闻乐见的题型。
n^2可以做,nlogn可以做,线性也可以做。。
我们那考场一位大神两题写了五个线段树(T1真的很像线段树操作喂。。
T1推了15分钟差分,5分钟码代码,10分钟对拍。
T2第一眼看过去是DP,看了下数据范围不大对了。
思考了五分钟,发现可以用数据结构优化。
先码了一个裸DP,试了一下貌似还可以,于是又推了一会题目,没看出什么东西来。
然后看了看第三题,我估计只会爆搜了,于是发呆5分钟开始重新码T2。
稳妥起见先码了一个离散化,把h搞到1~n(还有一个原因是h=0的话zkw线段树会囧。。。这是我后来才发现的。)
写了俩zkw分别记录当前处理到的这一棵是目标序列中奇数位和偶数位的情况。然后两种情况分别算一次。
然后写树的时候开闭区间没注意。。调了20分钟。。七弄八弄就交卷了囧。D2倒是A了两题……

发自 WordPress for Android

[Dr.Lib]Note:Data Structures – 关于存图

想当年,我只会用邻接矩阵,MLE的感人肺腑啊!

看最短路算法时,很多遍历边的方法我都看不懂。于是专门刷了一下存图……在此做个回顾总结。

部分来源于NOCOW。在此感谢。

邻接矩阵

作为图的一种储存方式,实际上就是一个二维数组。即为记为map[i][j],map[i][j]为节点i到节点j之间的边的权值,若不存在边,则为-1或其他。

这种储存方式的优势是书写方便,但缺点是非稠密图空间效率不高。相比之下,稍微麻烦点的前向星,邻接表等结构更加快速。

邻接表

邻接表是图的一种链式存储结构,在邻接表中,对图中每个顶点建立一个单链表,n个顶点,就要建n个链表。

第i个单链表中的结点表示依赖于顶点vi的边。单链表中每一个结点称为表结点,应包括三个域邻接点域、链域和数据域。邻接点域,用以存放与vi相邻接的顶点序号;链域用以指向与vi邻接的下一个结点,数据域存储和边或弧有关的的信息。

但是,为什么一定要链表呢?

有vector为什么不用呢?

前向星

前向星构造方法如下:

读入每条边的信息

将边存放在数组中

把数组中的边按照起点顺序排序

链式前向星

嗯,排序,快排,nlogn。

你看看人家存边只要O(n)……(如果你会计数排序的话……也行

那么……链式前向星。

还有吗……

十字链表和邻接多重表……在OI中还真没怎么见过