% VSHCLUSTER Vertex Substitution Heuristic for the p-median problem % [idx,netsim,dpsim,expref]=VSHCLUSTER(S,K) clusters data, using a set % of real-valued pairwise data point similarities as input. Clusters % are each represented by a cluster center data point (the "exemplar"). % The vertex substitution heuristic algorithm is adapted from an % implementation by M.J.Brusco and H.-F. Köhn based on: % P.Hansen, N.Mladenovic, Location Science 5, 207-226 (1997). % % The input is a pairwise similarity matrix S, where S(i,k) is the % similarity of point i to point k and S(i,k) needn't equal S(k,i). K, the % number of exemplars or clusters, is also a required input parameter. % % VSHCLUSTER automatically determines the number of clusters based on % the input preference 'p', a real-valued N-vector. p(i) indicates the % preference that data point i be chosen as an exemplar. Often a good % choice is to set all preferences to median(s); the number of clusters % identified can be adjusted by changing this value accordingly. If 'p' % is a scalar, VSHCLUSTER assumes all preferences are that shared value. % % The clustering solution is returned in idx. idx(j) is the index of % the exemplar for data point j; idx(j)==j indicates data point j % is itself an exemplar. The sum of the similarities of the data points to % their exemplars is returned as dpsim, the sum of the preferences of % the identified exemplars is returned in expref and the net similarity % objective function returned is their sum, i.e. netsim=dpsim+expref. % % [ ... ]=vshcluster(s,p,'NAME',VALUE,...) allows you to specify % optional parameter name/value pairs as follows: % % 'restarts' number of VSH restarts to use % 'preference' use a preference p like apcluster; or else p=diag(S) % 'details' (no value needed) Outputs extra details to console % function [idx,netsim,dpsim,expref] = vshcluster(S,K,varargin) start = clock; [I,J] = size(S); if (I==J) N=I; else error('Input similarity matrix S must be square'); end; K=double(K); if (K>N || K<1) error('K must be in range [1..N]'); end; if ~isa(S,'double'), S=double(S); warning('converting unknown S-matrix format to double-precision floating point'); end; p=diag(S); details=false; T=[]; % number of restarts while numel(varargin), switch(varargin{1}), case {'details','det'}, details=true; varargin(1)=[]; case {'nodetails','nodet'}, details=false; varargin(1)=[]; case {'nruns','restarts'}, T=varargin{2}; varargin(1:2)=[]; case {'pref','preference','p'}, p=varargin{2}; varargin(1:2)=[]; otherwise, varargin(1)=[]; end; end; if isempty(T), T=20; end; if numel(p)==1, p=repmat(p,[1 I]); end; T=double(floor(T)); if T<1, error('Number of restarts out of range (must be positive integer)'); end; M=max(max(S)); if M<=0, M=0; end; S=S-M; for i=1:N, S(i,i)=0; end; % make negative for algorithm to work idx = zeros(N,T); netsim = zeros(1,T); dpsim = zeros(1,T); expref = zeros(1,T); for t=1:T, ret=vshcluster_(S,K,0*S(:,1)); ret=double(ret); exs = ret; if length(unique(exs))N, error('something wrong with set of exemplars'); end; notexs = setdiff((1:N)',exs); [temp,idx_] = max(S(notexs,exs),[],2); idx(exs,t)=exs; idx(notexs,t)=exs(idx_); dpsim(:,t) = sum(temp) + (N-K)*M; expref(:,t) = sum(p(exs)); netsim(:,t) = dpsim(:,t)+expref(:,t); if details, if t==1, fprintf('Number of exemplars identified: %d (for %d data points)\n',K,N); end; fprintf(sprintf(' run%%0%dd (%%dsec): Net sim is %%f = %%f + %%f',floor(log10(T))+1),t,round(etime(clock,start)),netsim(t),dpsim(t),expref(t)); if netsim(t)>[-Inf; max(netsim(1:t-1))], fprintf(' <--best'); end; fprintf('\n'); end; end; return; function [exemplars, dpsim_vsh, netsim_vsh] = vshcluster_(s,K,p) % TEITZ & BART (1968) VERTEX SUBSTITUTION HEURISTIC (VSH) % USING FAST UPDATES FROM HANSEN & MLADENOVIC (1997), WHICH ARE % BASED ON WHITAKER'S (1983) WORK. % NOTE: For consistency with Frey and Dueck's (2007) affinity propagation, % code (www.psi.toronto.edu/affinitypropagation), we have % reverse-coded this program to operate on a SIMILARITY matrix, % rather than dissimilarity or distance matrices. The program can % read the similarity data (s) either in the form of an n x n matrix % of the three column form such as used in Frey and Dueck's test % problems for their Science paper (DocumentSummarization, Travel % Routing, Face Images). % INPUTS: s = an n x n symmetric or asymmetric SIMILARITY matrix, % either n x n directly or 3-column form like F&D. % K = the desired number of medians/clusters/exemplars % p = Frey and Dueck's (2007) preference vector. These values % have no bearing on our algorithm; however, we use them to % compute "netsim_vsh" for comparison to "netsim" from % affinity propagations % OUTPUTS: exemplars = the best found set of medians/exemplars % dpsim_vsh = the sum of the similarities between each point and its % nearest exemplar. COMPARE THIS VALUE TO "dpsim" % obtained using affinity propagation. BIGGER VALUES % of dpsim are better (more similarity / less error). % For example, for the face images data used in % Frey and Dueck's (2007) paper, the value of -9692 % obtained with vsh.m is better than the value of -9734 % produced by Frey and Dueck's affinity propagation % algorithm. % netsim_vsh = "dpsim_vsh" plus the sum of the exemplar % preferences. Although vsh.m optimizes dpsim_vsh, we % compute and report "netsim_vsh" for comparison to % "netsim" from affinity propagation. IT IS IMPORTANT % to note that when all preferences are the same (as % for most problems), netsim_vsh just augments % dpsim_vsh by a constant. If the preferences differ % across objects, then we recommend using "vsh_fc.m" % RECOMMENDATION FOR EASY COMPARISON TO AFFINITY PROPAGATION: % 1) Load in a test problem and run Frey and Dueck's affinity % propagation algorithm (apcluster.m), which will produce 'idx' and % 'dpsim' as output. % 2) Set K = length(unique(idx)) to use same number of clusters as % affinity propagation. % 3) Run this program (vsh.m) using s and K as input arguments. % 4) Compare 'dpsim_vsh' to the value of 'dpsim' produced by affinity % propagation, recalling that LARGER VALUES ARE BETTER % % COMPARATIVE RESULTS: Affinity Propagation Vertex sub. heur. (VSH) % K dpsim) (dpsim_vsh) % --- ------------ ----------------------- % 1. Document Summarization 4 -9582.03 -9429.90 % 2. Travel Routing 7 -60960.00 -60654.00 % 3. Olivetti Face Images 62 -9734.72 -9692.34 % 4. Fisher's iris data 6 -45.96 -43.98 % 5. Hartigan birth/death 6 -863.41 -863.41 % 6. Lin/Kernighan-drilling 11 -44979936.00 -44882137.00 % 7. European cities (202) 17 -2150.64 -2065.57 % 8. European cities (431) 16 -47706.65 -45361.12 % 9. European cities (666) 17 -131944.17 -127716.91 %10. Reinelts circuit board 22 -21798150.52 -21450616.61 % Note that the two algorithms produce the same solution for problem 5. % However, VSH produced a better solution than Aff. Prop. for each of the % remaining problems (i.e., a larger value indicating greater similarity). % state = 1; % fix state for testing % rand('state', state); P = p; p = []; p = K; [n1,n2] = size(s); if n1 == n2 % Is s a square matrix? n = n1; S = s; s = []; else % or 3-column data like Frey & Deuck? n = max(max(s(:,1)),max(s(:,2))); S = -9.9e+10.*ones(n,n); for i = 1:n1 i1 = s(i,1); i2 = s(i,2); S(i1,i2) = s(i,3); end s = []; end for i = 1:n S(i,i) = 0; end nreps = 20/20; gbest = -9.9e+12; fstore = zeros(nreps,1); c1 = zeros(n,1); c2 = zeros(n,1); for klk = 1:nreps s = randperm(n); % Randomly order the exemplar a1 = S(:,s(1:p)); % candidates and let the first fbest = 0; % p be the selected subset for i = 1:n; % Note: In this program dmax = -9.9e+12; % i indexes the rows, which for j = 1:p % assigned to exemplars or if a1(i,j) > dmax % selected columns indexed by j dmax = a1(i,j); jsel = j; end end fbest = fbest + dmax; c1(i) = s(jsel); % fbest = initial objective; end % c1(i) = exemplar to which for i = 1:n % object i is most similar dmax = -9.9e+12; % c2(i) = exemplar to which for j = 1:p % object is 2nd most similar jj = s(j); % is computed in this block if c1(i) == jj continue end if S(i,jj) > dmax dmax = S(i,jj); jsel = jj; end end c2(i) = jsel; end trig = 0; while trig == 0 % The exchange process begins here wstar = 0; % The block below finds, for each unseleted point (goin), the best exemplar % to remove (goout) if adding goin. The best (goin, goout) pair is % identified and (wstar) is the resulting change in the objective function. % If wstar >= 0, then the algorithm terminates because there is no viable % exchange that will improve the objective function further. for goin = p+1:n ii = s(goin); w = 0; u = zeros(n,1); for i = 1:n if S(i,ii) > S(i,c1(i)) w = w + S(i,ii) - S(i,c1(i)); else u(c1(i)) = u(c1(i)) + max(S(i,ii),S(i,c2(i))) - S(i,c1(i)); end end dmax = -9.9e+12; for j = 1:p jj = s(j); if u(jj) > dmax dmax = u(jj); goout = j; end end w = w + dmax; if w > wstar wstar = w; goinb = goin; gooutb = goout; end end if wstar <= .00001 trig = 1; continue end % The block below updates the c1 and c2 vectors if an (goin, goout) swap % results in an improvement. fbest = fbest + wstar; goinc = s(goinb); gooutc = s(gooutb); idum = s(goinb); s(goinb) = s(gooutb); s(gooutb) = idum; for i = 1:n if c1(i) == gooutc if S(i,goinc) >= S(i,c2(i)) c1(i) = goinc; else c1(i) = c2(i); dmax = -9.9e+12; for j = 1:p jj = s(j); if c1(i) == jj continue end if S(i,jj) > dmax dmax = S(i,jj); jsel = jj; end end c2(i) = jsel; end else if S(i,c1(i)) < S(i,goinc) c2(i) = c1(i); c1(i) = goinc; elseif S(i,goinc) > S(i,c2(i)) c2(i) = goinc; elseif c2(i) == gooutc dmax = -9.9e+12; for j = 1:p jj = s(j); if c1(i) == jj continue end if S(i,jj) > dmax dmax = S(i,jj); jsel = jj; end end c2(i) = jsel; end end end end if fbest > gbest gbest = fbest; sbest = s; end fstore(klk) = fbest; end dpsim_vsh = gbest; exemplars = sbest(1:p); % Store best solution netsim_vsh = dpsim_vsh + sum(P(exemplars)); fmean = mean(fstore); % Mean performance fmedian = median(fstore); % Median performance return