Saturday, 25 November 2017

Technical Problem to solve - "Blockfinder".

I have a technical problem that needs to be solved as part of a larger project that has been in the works for a while. Code development needed.


Objective

From a given set of tiles build a 2 x 4 block of 8 that have matching edges.


After encoding each tile pattern as four characters. A different 2 x 4 layout (not same as above) can be shown as:


On the top row the tiles join with keys K,G,G. Between the two rows of tiles joining keys are G,R,R,P. Between the bottom row of tiles the joining keys are R,G,G. In the centre of each tile is a description :

[ TileNumber r n ]   # n is rotation between 0..3

Which represents the tile number and rotation in the layout. The dot in the corner of each tile confirms the rotational layout between zero and three.  This layout would be summarised as 

B8:0x595c:T13R2,T15R3,T7R3,T9R3,T12R3,T4R1,T3R3,T5R0

Inputs

1) tile.txt file describes a set of tiles each wit4 sides. On each of four sides (north, south, east, west) is a key character/number. Tiles are to be assembled into a 2 x 4 layout where the side edge keys match. 

Sample tile.txt file :


#Tiles with Corner colours type Normalised ns we base rotation 28 Mar 13, 
#TileNumber,North,East,South,West,Tyle Type,Normalised,NS pairs,WE pairs,Base roatation
1,4,0,1,9,e,1,"4,1","0,9",1
2,8,9,8,8,i,1,"8,8","9,8",0
3,6,9,7,7,i,2,"6,7","9,7",0
4,0,1,9,4,e,2,"0,9","1,4",2
5,9,8,7,8,i,3,"9,7","8,8",0
                          | Base rotation needed to put edge sides downwards
                    ----- EW pair ( duplicate information ) 
              ----- NS pair ( duplicate information ) 
            - number of tile when each tile type starts at 1 
          - tile type i=interior, e=edge, c=corner
  N E S W - edge keys 1=a,2=b,7=g,8=h .. etc
N - tile number

Colour key 0 = outside edge of puzzle. Each edge tile will have one 0 edge, each corner tile has 2 of 0 sides.

2) A textfile that contains one Hex number on each line.
Each line is a Hex value that has 8 bits set within a 40bit value (for a 40 tile deck).  Max Deck size is currently 64 but may go up to 256 in a future project. 

EG: A textfile will look like these lines ( order is not important )


0x1242408300
0x1244000318
0x1244000b08
...

The set bits in the number represent the number of the tile in the tile file that is to be used in the (Block 8 tiles ) B8 layout. Hex to binary conversion shows the tiles for this sample ....

$ bc -l
bc 1.06
ibase=16   # << set hex input
obase=2    # << set binary output
C          # << input test number
1100       # << result
1242408300 # << actual hex number
1001001000010010000001000001100000000   #<< binary result

Rightmost bit is counted as "1" to match tile number 1 in the tile file.
<< MSB --------                       ----- LSB >
40   36   32   28   24   20   16   12   8    4  1  << bit number
|    |    |    |    |    |    |    |    |    |  
...1 ..1. .1.. ..1. .1.. .... 1... ..11 .... ....  << Extracted bits
...| ..|. .|.. ..|. .|.. .... |... ..|| .... ....
0001 0010 0100 0010 0100 0000 1000 0011 0000 0000  << Binary number
   1    2    4    2    4    0    8    3    0    0  << Hex digits 1 for each 4 bits                                                                                

Tiles in sample 0x1242408300  layout would be 

9, 10, 16, 23, 26, 31, 34, 37

3) A clue string is provided in the format "aaaa/bbbb/ccdd" which represents the edge keys on the top most unmatched edge, bottommost unmatched edge and the two shorter sides. For the layout and above the clue string would be :
rwbr/kook/rogr

When a B8 tile layout and its rotations are found the clue string is checked to ensure that the outer boundary keys match the clue string. Layouts that correctly represent a B8 but fail the clue string should be reported.  All the lines should have at least one layout that matches the clue string.

Task
Find a B8 layout of tiles for each hex number in the textfile.

A B8 layouts is reported using the following single line notation.  A layout and other possible outcomes would be reported as: 

B8:0x1242408300:T13R2,T15R3,T7R3,T9R3,T12R3,T4R1,T3R3,T5R0
B8:0x1242408360:Fail Too many bits
B8:0x1242408b08:Fail No layout found
B8:0x1242408b08:Fail Clue string mismatch: T13R2,T15R3,T7R3,.....

Outline size of task

Arrange and rotate 8 tiles to find layout which has inner edge matches and outer face matches with a clue string.

Given eight tiles there are 
8*7*6*5*4*3*2*1  = 8! = 40,320 possible tile layout orders.

Each tile can have 4 possible rotations leading to 
4*4*4*4*4*4*4*4 = 4^8 = 65,536 different sets of rotations

Multiplying rotations and layout orders there are :
2,642,411,520 possible ways of organising eight tiles.

Time for 2.6 billion simple loops in Perl and c++
$ time perl -e 'foreach(1..40320){ foreach (1..65536) {$t+=1}};print"t=$t\n";'
t=2642411520
real 2m35.544s
user 2m35.265s
sys 0m0.165s

$# Same loops coded in c++ run in about 4% of the time.
$ make cpptime ; time cpptime
c++     cpptime.cpp   -o cpptime
t=2642411520
real 0m5.930s

user 0m5.913s


Not all of these need to be tested as tiles can be checked against their neighbours before rotations in place are attempted. If a tile has none of the same edge keys as its neighbour then they would never work next to each other in the same layout. A logical test can be constructed to see if any 8 tile layout is likely to actually work before rotations are applied.

Hint

There are two distinct stages to this problem. First is to find out specific layouts of cards that could possibly join together, and secondly to find the rotations that fit any given layout into a solution.
If possible layouts are generated in standard permutation fashion as follows:
1 2 3 4 5 6 7 8  
1 2 3 4 5 6 8 7  
1 2 3 4 5 7 6 8  
1 2 3 4 5 7 8 6  

A test can be performed to see if cards 1 & 2 can possibly match, if not then the next 5041 layouts starting 1 2 can be discarded as well as the 720 after that that start 2 1.  Similarly for columns 2 and 3, large jumps of 720 rows can be made down the rows of permutations without having to test a layout in depth.

Column - Number of rows with the same number in the that column.
[1] - 5040 
[2] - 720
[3] - 121
[4] - 24
[5] - 6
[6] - 1
[7] - 1

As can be seen the first row that starts 2 1 is on row 5041.
 Row
Number    Perm order



Technology to be used


  • Perl and/or c++
  • Multi-tasking via threads, openMP or standard libraries encouraged 8 or 12 way processors available.
  • 8GB memory, 5TB HDD and/or 250GB SSD.
  • Typical time to solution 0.01 second or better per layout.
  • Any c++ should be "standards compliant" and self contained at src level and (highly desirable) to be adjustable to work on W10 or Mac or Ubuntu.
  • Code should be structured so that it can be trivially built into a larger program or as stand alone utility.



    Tuesday, 21 November 2017

    The warning and the fail, ICO investor cash vanishes.

    Investing in new areas of complex technology is perilous even for serious knowledgeable investors. A recent round of crypto currency launches known as ICO initial coin offerings caused the UK financial conduct authority to issue this warning.
    The term ICO refers to a digital way of raising funds from the public using a virtual currency, also known as cryptocurrency. An ICO can also be known as ‘token sale’ or ‘coin sale’.
    ICO issuers accept a cryptocurrency, like Bitcoin or Ether, in exchange for a proprietary ‘coin’ or ‘token’ that is related to a specific firm or project. ICOs vary widely in design. The digital token issued may represent a share in a firm, a prepayment voucher for future services or in some cases offer no discernible value at all. Often ICO projects are in a very early stage of development.
    ICOs are very high-risk, speculative investments.
    From : Financial conduct authority announcement

    No sooner said than done. 
    Confido tokens had a market cap of $10 million last week, before the company disappeared, but now the tokens are worthless. And investors are crying foul.
    Apparently the company pulled the shutters down by deleting all their social media accounts and leaving their backend administrators "Tokenlot" holding the empty bucket. A disappointing state of affairs but such occasional events are unsurprising considering the wild west frontier that underlines underlies bitcoin and similar technologies. Previously stories of stolen millions, closed exchanges and lack of global regulation also dent confidence in this in emerging technology.

    Personally I like my investments, transparent, regulated and income producing.

    Update : 21 Nov 2017 31 Million $ lifted from Bitcoin operator Tether


    See this Hufpost article for information re this complex area. 







    Friday, 17 November 2017

    Excellent series for the STEMM enthusiast

    This excellently produced series of books, branded under the Sterling Milestones banner follows a consistent format. Each book contains 250 individual ideas encapsulated in a picture and a page of text. Organised in roughly time order each book expands the core ideas and discoveries in a STEMM related field.  

    My personal favourite is The Space Book which starts with the big bang, rolling through early discoveries of the planets and calendars towards current activities in space and onwards to how the universe will end. Full of facts and contextual pictures this book really brings alive that unique combination of engineering and exploration which is the discovery and exploration of outer space.

    Each book starts with an introduction to the field and finishes with a comprehensive index, Notes and further reading followed by photos credits. Most are offered by Chris Pickover but other authors have some of the titles.

    For more details Sterling milestones books 



    Is available via Amazon or ABEbooks as usual.

    Friday, 10 November 2017

    Some of our favourite software died today

    Today 10th Nov 2017 the house iMac was upgraded to OSx 10.11 El Captain having happily run on OSx 10.6 Snow Leopard since in in 2009.   Snow Leopard was the last version of Mac OSx that would run PowerPC code ( in emulation mode) so this lays to rest QuarkXPress version 6.5 that was the last vestiges of DTP software used for the other halfs' typesetting business.  Purchased as box software many, many years ago Quark has provided good service but is no longer required. Running OS that is over five years old, that no longer get security updates, is also a bit of risk. Also expiring was the last free version of Mac the Ripper DVD content extraction tool. I have to admire Apple by making it so easy and smooth to do an OS upgrade effectively jumps forward five operating system releases in one go.

    The smaller handheld Apple devices upgraded to the latest version of IOS 11.0 recently and this has obsoleted early Apps. My favourite APP was ConvertBot, a little gadget that converted numbers between different units. The software was so well designed and looked so slick that for me it became an early demonstrator for Apple's approach to building a software ecosystem. Many other Apps that have not been updated in the last year or few will also have expired and become unusable. More information here.



    Monday, 6 November 2017

    Coffee jar plastic top madness

    I like a cup of strong flavour instant coffee, I also like to recycle and encourage reduction in packaging. 

    In a fit of packaging madness Jacobs Douwe Egberts  lo'r has this most ridiculous plastic top on it's instant coffee. 


    551g Total Jar  made up of 
    347g of glass          -> 62.3 %
    37g   of plastic top ->   6.7 % 
    167g of product     ->  30.3 % 

    Looking by height 
    19cm Total Jar  
    16cm Glass     -> 84 %
    6.5 cm plastic top -> 15%   which overlaps jar by 3cm and has a further 3cm on top of that.



    This type of plastic top is typically non-recyclable ( unlike the glass ) and has no recycling marks identifying the plastic. There is also a plastic film wrapping the glass, which may look nice but does nothing for the recycling process.

    Is there a more effective packaging solution ? Yes but the economics don't look great on the latest Morrisons shopping trip.

    L'or = £2.34 per 100g
    Nescafe gold refill pack = £3.00 per 100g   ( no glass, minimal plastic bag )
    66p more expensive for packaging free refill option. Sigh.




    Even worse for the environmentally minded Morrisons on-line shopping currently has the refill pack 50p / 100g more expensive than the glass jar version.  6 Nov 2017.

     


    Madness.


    Further advice on Recycling Etiquette. 


    -----
    Reply from L'or
    Thank you for contacting L’Or.
    The lid is made of plastic and is recycled through standard plastic recycling bins.
    Kind Regards


    C**** M******

    Consumer Relations Team

    Friday, 3 November 2017

    Technical Problem to solve - "Five in a bed" Key matching across lists.

    I have a technical problem that needs to be solved as part of a larger project that has been in the works for a while. Code development needed.

    Starting conditions
    Five text files {A..E} that each contain one number on each line.
    Each line is a Hex value that has 8 bits set within a 40bit value.
    EG: Each file will look like these lines ( order is not important )
    0x1242408300
    0x1244000318
    0x1244000b08
    ....
    0x1000000d2d
    0x100000152d
    0x1000001c1d
    ....
    0xfa402000
    0xfa600000
    0xfa80010
    0xfc000480
    0xfc101
    0xfc200080
    0xfca00

    The files have between a 5000 and 500,000 lines. On average about 30,000 lines.

    Task
    Find one or more solution groups of lines, one from each file, that when OR’d together have all 40 bits set (0xFFFFF) but when AND’d together have 0 bit set (0x00000). The answer would deliver results as
    {FilenameA,0xvalue,FilenameB,0xvalue,FilenameC,0xvalue,FilenameD,0xvalue,FilenameE,0xvalue}
    repeated for each solution set found. Have the ability to stop after first solution or provide multiple answers is desirable.

    Outline size of task - ( updated )
    A simplistic approach would take all the lines from the first file and compare with all lines from second file and keep the matching pairs that are then compared against 3rd file and so on. This scales badly as up to 30,000 ^ 5 = 24,300,000,000,000,000,000,000 worst case comparisons may be needed. Further thought reduced this number as any failed comparison between the first two keys would remove the need for further comparisons for that key pair.  Also when 4 keys have been matched locating the last key becomes a trivial look-up for existence in the last key file. To get 4 keys we can generate keys pairs from files A&B and separately from files C&D then just merge the key results. This process would need 3 comparisons making worst case about 30,000 ^ 3 = 27,000,000,000,000 comparisons to find all the matching sets. As we only need 1 matching result 5 key set set, then if we can use a progressive algorithm that does not need all the results from one stage before starting the next stage, the actual number of comparisons would be much less.

    Test runs show that from files with 8,172 and 438,761 entries  517,183,384 matching pairs are generated (approx 14% hit rate). The normal size of the files is 30,000 entries.  If this 14% success ratio persists then we would have in the normal case

    (( 30000 * 30000 )* 2 * 0.14 ) 
    252,000,000

    comparisons and final stage lookups.  Which seems do-able.

    Structure of answer
    Read and prepare file data into data structures.

    One possible solution would be to run matches between files A&B and separately files C&D to find second stage candidates. Join the results lists and look-up against file E for final matches. This is one approach but I am open to other possibilities.

    It is expected that some sort of pre-organisation of the file keys would be used to reduce the need to compare each key with each other.  Possible some sort of Bloom filter or key grouping ( This is the needed magic bit. )

    Technology to be used.
    Perl and/or c++
    Multi-tasking via threads or standard libraries encouraged 8 or 12 way processors available.
    8GB memory, 5TB HDD and/or 250GB SSD.
    Typical time to solution 5 minutes or better.
    Any c++ should be "standards compliant" and self contained at src level and (highly desirable) to be adjustable to work on W10 or Mac or Ubuntu.
    Code should be structured so that it can be trivially built into a larger program or as stand alone utility.

    -------
    Done a perl prototype.
    Usage 
    5iab.pl {-v n -p n} -x n Afile Bfile Cfile Dfile Efile
    Each file has a list of numbers 1 each on a line.
    Find the lines in the files that have no overlapping bits, IE when & togther make 0. 
    -v n  Set verbose level n default 0; 
    -p n  Dump partial results after n keys and save mem 10000 is recomended for 2.5GB usage  
          memMode is set from -p and mode =4
    -x {1,2,3,4} Run using files as intermediate result holders
                Run Mode x=1 work on Afile Bfile (default) , x=2 work on Cfile Dfile ,x=3 Murge previous done runs with Efile
        Run Mode x=4 Do all stages in this single process, sets memMode=0;
    -x 5         Run as in memory solver all stages.

    Hint: For shorter run times put the file on the command line in increasing size order.

    Normal run would be 
    5iab.pl -x 1 fileA fileB fileC fileD fileE &
    5iab.pl -x 2 fileA fileB fileC fileD fileE &
    wait # till both have finished 
    5iab.pl -x 3 fileA fileB fileC fileD fileE # which will pick up the results and do final murge.

    or


    5iab.pl -x 4 fileA fileB fileC fileD fileE # may run out of memory on big sets






    Thursday, 2 November 2017

    Bike stand fail - fixed

    I have a Ridgeback Flight 3.0 bike kitted out for touring mode but it has a falling over problem.




    Quite simply fixed by moving stand attachment forward and cutting off corner of stand to fit.

    Before picture

    After ...


    The stand was only £5 on Amazon and other more adjustable expensive ones are available.