Runups and cycles
Wildon's Weblog 2018-03-12
Let xs be a stream of values that can be compared for equality. Let ++ denote concatenation of lists. Assuming that xs is of the form runUp ++ cyc ++ cyc ++ ..., what is a good way to find the decomposition (runUp, cyc)?
The problem is more subtle than it might seem. If xs is produced by iterating a function, so the entries are [x, f x, f f x, ...] then there are several cycle finding algorithms that can be used. In general, an algorithmic solution is impossible, because the stream may be of the form cyc ++ rest, where cyc has arbitrary length and agrees with
rest for positions. In practice, we might know that any cycle has some bounded length, or be prepared to accept a small chance of misidentification.
In this case, the following Haskell code finds the first decomposition (runUp, guessedCycle) such that the stream is runUp ++ rest and rest, which has length at least t, is of the form guessedCycle ++ guessedCycle ++ rest'.
runupAndCycle t xs = runupAndCycleAux t bs (zip [1..] (tail xs))runupAndCycleAux t xs [] = error $ show (t, xs)runupAndCycleAux t xs qxs@((q, x) : _) = case lookup (map snd $ take t qxs) (take q tailsPositions) of Nothing -> runupAndCycleAux t xs (tail qxs) Just p -> -- the t entries starting in positions q and p agree; -- test if taking guessedCycle to be all entries -- between positions q and p-1 gives a decomposition -- of the required form let xsFirst = take (q-p) $ drop p xs xsSecond = take (q-p) $ map snd qxs in if and (zipWith (==) xsFirst xsSecond) then (take p xs, xsSecond) else runupAndCycleAux t xs (tail qxs) where tailsPositions = [(take t ys, p) | (ys, p) <- zip (tails xs) [0..]]
For example
-
runupAndCycle 4 [1,2,3,1,2,3,4,5,1,2,3,4,5]evaluates to([1,2,3],[1,2,3,4,5]); -
runupAndCycle 4 [1,2,3,1,2,3,4,6,1,2,3,4,5]raises an exception, because while there is agreement for four positions, this is shown by theif ... then ... else ...test not to correspond to a cycle (and the bistream is then exhausted without finding anything better); -
runUpAndCycle 4 [1,2,3,1,2,3,4,5,1,2,3,4,6]raises an exception for the same reason; -
runUpAndCycle 4 [1,2,3,1,2,3,4,5,1,2,3,4,5,6]evaluates to([1,2,3],[1,2,3,4,5]), as the first example, wrongly suggesting a 5-cycle. Replacing the first argument 4 with 6 we get an exception instead.
The algorithm may use work to exhaust
bits without finding a suitable decomposition. For example with
, the stream
[0,0,0,1,0,0,0,2,0,0,0,3,0,0,0,4, ...] finds possible cycle starts in every fourth position; by position , these require work
to investigate.
I’d like to know if there is a faster or more elegant algorithm—ideally both of course—but some searching only found hits for the deterministic version of the problem.